Methods and systems for target tracking

ABSTRACT

A method includes obtaining an image frame captured by an imaging device carried by an unmanned vehicle and containing a target object, extracting one or more features of the target object from a region selected by a user on the image frame, and determining whether the target object is a predetermined recognizable object type based on a comparison of the one or more features with one or more characteristics associated with the predetermined recognizable object type. In response to the target object being the predetermined recognizable object type, tracking functions associated with the predetermined recognizable object type are initiated. In response to the target object not belonging to the predetermined recognizable object type, tracking functions associated with a general object type are initiated.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of application Ser. No. 16/116,169, filed on Aug. 29, 2018, which is a continuation of International Application No. PCT/CN2016/075247, filed on Mar. 1, 2016, the entire contents of both of which are incorporated herein by reference.

TECHNICAL FIELD

The disclosed embodiments relate generally to target tracking and more particularly, but not exclusively, to initialization for automatic target tracking using characteristics associated with a recognizable object type.

BACKGROUND

Movable objects such as unmanned aerial vehicles (UAVs) can be used for performing surveillance, reconnaissance, and exploration tasks for military and civilian applications. A movable object may carry a payload configured to perform a specific function, such as capturing images of the surrounding environment or tracking a specific target. For example, a movable object may track an object moving along the ground or through the air. Movement control information for controlling a movable object is typically received by the movable object from a remote device and/or determined by the movable object.

Before a UAV starts to track a target, an initialization process may be performed to ensure that one or more conditions are optimized for automatically tracking the target. Various methods may be used for improving the initialization process.

SUMMARY

There is a need for systems and methods for improved target tracking. Such systems and methods optionally complement or replace conventional methods for target tracking.

In accordance with some embodiments, a method for tracking a target object includes obtaining a first image frame captured by an imaging device borne by an unmanned vehicle, the first image frame containing the target object. The method extracts one or more features of the target object from the first image frame. The target object is within a region selected by a user on the first image frame. The method also determines whether the target object is a predetermined recognizable object type based on a comparison of the extracted one or more features with one or more characteristics associated with the predetermined recognizable object type. In accordance with a determination that the target object is a predetermined recognizable object type, tracking functions provided in the computing system and associated with the predetermined recognizable object type are initiated. In accordance with a determination that the target object does not belong to any predetermined recognizable object type, tracking functions provided in the computing system and associated with a general object type are initiated.

In accordance with some embodiments, a system for tracking a target object comprises one or more processors; memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for: obtaining a first image frame captured by an imaging device borne by an unmanned vehicle, the first image frame containing the target object; extracting one or more features of the target object from the first image frame, wherein the target object is within a region selected by a user on the first image frame; determining whether the target object is a predetermined recognizable object type based on a comparison of the extracted one or more features with one or more characteristics associated with the predetermined recognizable object type; in accordance with a determination that the target object is a predetermined recognizable object type, initiating tracking functions provided in the computing system and associated with the predetermined recognizable object type; and in accordance with a determination that the target object does not belong to any predetermined recognizable object type, initiating tracking functions provided in the computing system and associated with a general object type.

In accordance with some embodiments, a non-transitory computer readable storage medium stores one or more programs, the one or more programs comprising instructions, which when executed by a movable object, cause the movable object to: obtain a first image frame captured by an imaging device borne by an unmanned vehicle, the first image frame containing the target object; extract one or more features of the target object from the first image frame, wherein the target object is within a region selected by a user on the first image frame; determine whether the target object is a predetermined recognizable object type based on a comparison of the extracted one or more features with one or more characteristics associated with the predetermined recognizable object type; in accordance with a determination that the target object is a predetermined recognizable object type, initiate tracking functions provided in the computing system and associated with the predetermined recognizable object type; and in accordance with a determination that the target object does not belong to any predetermined recognizable object type, initiate tracking functions provided in the computing system and associated with a general object type.

In accordance with some embodiments, an unmanned aerial vehicle (UAV) comprises: a propulsion system and one or more sensors. The UAV is configured to: obtain a first image frame captured by an imaging device borne by an unmanned vehicle, the first image frame containing the target object; extract one or more features of the target object from the first image frame, wherein the target object is within a region selected by a user on the first image frame; determine whether the target object is a predetermined recognizable object type based on a comparison of the extracted one or more features with one or more characteristics associated with the predetermined recognizable object type; in accordance with a determination that the target object is a predetermined recognizable object type, initiate tracking functions provided in the computing system and associated with the predetermined recognizable object type; and in accordance with a determination that the target object does not belong to any predetermined recognizable object type, initiate tracking functions provided in the computing system and associated with a general object type.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a target tracking system, in accordance with some embodiments.

FIG. 2A illustrates an exemplary movable object in a target tracking system, in accordance with some embodiments.

FIG. 2B illustrates an exemplary carrier of a movable object, in accordance with some embodiments.

FIG. 2C illustrates an exemplary payload of a movable object, in accordance with some embodiments.

FIG. 3 illustrates an exemplary sensing system of a movable object, in accordance with some embodiments.

FIG. 4 is a block diagram illustrating an implementation of memory of a movable object, in accordance with some embodiments.

FIG. 5 illustrates an exemplary control unit of a target tracking system, in accordance with some embodiments.

FIG. 6 illustrates an exemplary computing device for controlling a movable object, in accordance with some embodiments.

FIG. 7 is a flow diagram illustrating a method for performing initialization for target tracking, in accordance with some embodiments.

FIG. 8 illustrates an exemplary configuration of a movable object, carrier, and payload, in accordance with some embodiments.

FIG. 9A illustrates an exemplary initialization process for tracking a target, in accordance with some embodiments.

FIGS. 9B-9C illustrate an image containing a target displayed on a user interface 950, in accordance with some embodiments.

FIG. 10A illustrates an exemplary initialization process for tracking a target, in accordance with some embodiments.

FIG. 10B illustrates an image containing a target displayed on a user interface, in accordance with some embodiments.

FIG. 11 illustrates an exemplary method for determining a pitch angle, in accordance with some embodiments.

FIG. 12 illustrates an exemplary method for determining a pitch angle of a target 106, in accordance with some embodiments.

FIG. 13A illustrates an initialization process for tracking a target, in accordance with some embodiments.

FIG. 13B illustrates an image containing a target displayed on a user interface, in accordance with some embodiments.

FIG. 14 illustrates an exemplary method for determining a horizontal distance between a target and a movable object, in accordance with some embodiments.

FIG. 15 illustrates an exemplary method for determining a horizontal distance between a generic target and a movable object, in accordance with some embodiments.

FIGS. 16A-16G are a flow diagram illustrating a method for tracking a target, in accordance with some embodiments.

DETAILED DESCRIPTION

Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the various described embodiments. However, it will be apparent to one of ordinary skill in the art that the various described embodiments may be practiced without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the embodiments.

The following description uses an unmanned aerial vehicle (UAV) as an example of a movable object. UAVs include, e.g., fixed-wing aircraft and rotary-wing aircraft such as helicopters, quadcopters, and aircraft having other numbers and/or configurations of rotors. It will be apparent to those skilled in the art that other types of movable objects may be substituted for UAVs as described below in accordance with embodiments of the disclosure.

The present disclosure provides techniques related to initialization for target tracking by UAVs. In some embodiments, a user selects a target from an image displayed on a user interface of the control unit. For example, the image is displayed and the input is received via a touchscreen of the control unit. In some embodiments, the system performs an initialization of target tracking. The initialization process includes feature extraction and target classification. In some embodiments, the system determines whether the target is a predetermined recognizable type or a general type. Preset characteristics associated with the predetermined recognizable type can be used for determining whether the UAV is ready for automatic target tracking. Preset characteristics associated with the predetermined recognizable type can also be used for adjusting one or more control parameters for controlling the UAV, the carrier, and/or the imaging device. In some embodiments, when the initialization is completed, the control unit and/or UAV manage operations associated with target tracking. In some embodiments, only image data of the target is used for performing the initialization process. In this manner, the system can track a target that does not include a position measuring unit, e.g., a GPS receiver, to provide position information of the target.
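
As an illustrative sketch only (not part of the disclosed embodiments), the following Python fragment shows one way the initialization decision described above could be organized: features are extracted from the user-selected region, compared against preset characteristics, and the matching set of tracking functions is selected. The feature names, the aspect-ratio test, and the preset values are assumptions made for this example.

    # Sketch of the initialization flow, assuming the user-selected region is a
    # pixel bounding box; is_recognizable_type() is an illustrative placeholder.
    import numpy as np

    def extract_features(frame: np.ndarray, roi: tuple) -> dict:
        """Extract simple features from the user-selected region of the frame."""
        x, y, w, h = roi
        patch = frame[y:y + h, x:x + w]
        return {
            "size_px": (w, h),
            "aspect_ratio": h / w,
            "mean_color": patch.reshape(-1, patch.shape[-1]).mean(axis=0),
        }

    def is_recognizable_type(features: dict, characteristics: dict) -> bool:
        """Compare extracted features with preset characteristics of one type."""
        lo, hi = characteristics["aspect_ratio_range"]
        return lo <= features["aspect_ratio"] <= hi

    def initialize(frame, roi, preset_types: dict) -> str:
        features = extract_features(frame, roi)
        for type_name, characteristics in preset_types.items():
            if is_recognizable_type(features, characteristics):
                return type_name      # initiate type-specific tracking functions
        return "general"              # initiate general-object tracking functions

    frame = np.zeros((480, 640, 3), dtype=np.uint8)
    presets = {"human": {"aspect_ratio_range": (1.5, 4.0)}}
    print(initialize(frame, (300, 100, 40, 120), presets))  # -> "human"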

FIG. 1 illustrates a target tracking system 100, in accordance with various embodiments of the present disclosure. Target tracking system 100 includes a movable object 102 and a control unit 104. In some embodiments, target tracking system 100 is used to track target 106 and/or initiate tracking of target 106.

In some embodiments, target 106 includes natural and/or man-made objects such as geographical landscapes (e.g., mountains, vegetation, valleys, lakes, and/or rivers), buildings, and/or vehicles (e.g., aircraft, ships, cars, trucks, buses, vans, and/or motorcycles). In some embodiments, the target 106 includes live subjects such as people and/or animals. In some embodiments, target 106 is moving, e.g., moving relative to a reference frame (such as the Earth and/or movable object 102). In some embodiments, target 106 is static. In some embodiments, target 106 includes an active target system that transmits information about target 106, such as the target's GPS location, to movable object 102, control unit 104, and/or computing device 126. For example, information is transmitted to movable object 102 via wireless communication from a communication unit of the active target to communication system 120 (shown in FIG. 2A) of movable object 102. Active targets include, e.g., friendly vehicles, buildings, and/or troops. In some embodiments, target 106 includes a passive target (e.g., that does not transmit information about target 106). Passive targets include, e.g., neutral or hostile vehicles, buildings, and/or troops.

In some embodiments, movable object 102 is configured to communicate with control unit 104, e.g., via wireless communications 124. For example, movable object 102 receives control instructions from control unit 104 and/or sends data (e.g., data from movable object sensing system 122 (shown in FIG. 2A)) to control unit 104.

Control instructions include, e.g., navigation instructions for controlling navigational parameters of movable object 102 such as position, orientation, attitude, and/or one or more movement characteristics of movable object 102, carrier 108, and/or payload 110. In some embodiments, control instructions include instructions directing movement of one or more of movement mechanisms 114 (shown in FIG. 2A). For example, control instructions are used to control flight of a UAV. In some embodiments, control instructions include information for controlling operations (e.g., movement) of carrier 108. For example, control instructions are used to control an actuation mechanism of carrier 108 so as to cause angular and/or linear movement of payload 110 relative to movable object 102. In some embodiments, control instructions are used to adjust one or more operational parameters for payload 110, such as instructions for capturing one or more images, capturing video, adjusting a zoom level, powering on or off, adjusting an imaging mode (e.g., capturing still images or capturing video), adjusting an image resolution, adjusting a focus, adjusting a viewing angle, adjusting a field of view, adjusting a depth of field, adjusting an exposure time, adjusting a shutter speed, adjusting a lens speed, adjusting an ISO, changing a lens, and/or moving payload 110 (and/or a part of payload 110, such as imaging device 214 (shown in FIG. 2C)). In some embodiments, the control instructions are used to control communication system 120, sensing system 122, and/or another component of movable object 102.

In some embodiments, control instructions from control unit 104 include target information, as described further below with regard to FIG. 7.

In some embodiments, movable object 102 is configured to communicate with computing device 126. For example, movable object 102 receives control instructions from computing device 126 and/or sends data (e.g., data from movable object sensing system 122) to computing device 126. In some embodiments, communications from computing device 126 to movable object 102 are transmitted from computing device 126 to cell tower 130 (e.g., via internet 128) and from cell tower 130 to movable object 102 (e.g., via RF signals). In some embodiments, a satellite is used in lieu of or in addition to cell tower 130.

In some embodiments, target tracking system 100 includes additional control units 104 and/or computing devices 126 configured to communicate with movable object 102.

FIG. 2A illustrates an exemplary movable object 102 in target tracking system 100, in accordance with some embodiments. In some embodiments, one or more components of movable object 102, such as processor(s) 116, memory 118, communication system 120, and sensing system 122, are connected by data connections, such as a control bus 112. A control bus optionally includes circuitry (sometimes called a chipset) that interconnects and controls communications between system components.

Movable object 102 typically includes one or more processing units 116, memory 118, one or more network or other communications interfaces 120, sensing system 122, and one or more communication buses 112 for interconnecting these components. In some embodiments, movable object 102 is a UAV. Although movable object 102 is depicted as an aircraft, this depiction is not intended to be limiting, and any suitable type of movable object can be used.

In some embodiments, movable object 102 includes movement mechanisms 114 (e.g., propulsion mechanisms). Although the plural term “movement mechanisms” is used herein for convenience of reference, “movement mechanisms 114” refers to a single movement mechanism (e.g., a single propeller) or multiple movement mechanisms (e.g., multiple rotors). Movement mechanisms 114 include one or more movement mechanism types such as rotors, propellers, blades, engines, motors, wheels, axles, magnets, nozzles, animals, and/or human beings. Movement mechanisms 114 are coupled to movable object 102 at, e.g., the top, bottom, front, back, and/or sides. In some embodiments, movement mechanisms 114 of a single movable object 102 include multiple movement mechanisms each having the same type. In some embodiments, movement mechanisms 114 of a single movable object 102 include multiple movement mechanisms having different movement mechanism types. Movement mechanisms 114 are coupled to movable object 102 (or vice-versa) using any suitable means, such as support elements (e.g., drive shafts) or other actuating elements (e.g., actuators 132). For example, an actuator 132 receives control signals from processor(s) 116 (e.g., via control bus 112) that activate the actuator to cause movement of a movement mechanism 114. For example, processor(s) 116 include an electronic speed controller that provides control signals to actuators 132.

In some embodiments, the movement mechanisms 114 enable movable object 102 to take off vertically from a surface or land vertically on a surface without requiring any horizontal movement of movable object 102 (e.g., without traveling down a runway). In some embodiments, movement mechanisms 114 are operable to permit movable object 102 to hover in the air at a specified position and/or orientation. In some embodiments, one or more of the movement mechanisms 114 are controllable independently of one or more of the other movement mechanisms 114. For example, when movable object 102 is a quadcopter, each rotor of the quadcopter is controllable independently of the other rotors of the quadcopter. In some embodiments, multiple movement mechanisms 114 are configured for simultaneous movement.

In some embodiments, movement mechanisms 114 include multiple rotors that provide lift and/or thrust to movable object 102. The multiple rotors are actuated to provide, e.g., vertical takeoff, vertical landing, and hovering capabilities to movable object 102. In some embodiments, one or more of the rotors spin in a clockwise direction, while one or more of the rotors spin in a counterclockwise direction. For example, the number of clockwise rotors is equal to the number of counterclockwise rotors. In some embodiments, the rotation rate of each of the rotors is independently variable, e.g., for controlling the lift and/or thrust produced by each rotor, and thereby adjusting the spatial disposition, velocity, and/or acceleration of movable object 102 (e.g., with respect to up to three degrees of translation and/or up to three degrees of rotation).
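
For illustration, a simplified Python sketch of how independently variable rotor rates can be combined to produce collective thrust plus roll, pitch, and yaw adjustments is shown below; the sign conventions and the plus-configuration quadcopter layout are assumptions of this example, not a description of any particular vehicle's control law.

    # Toy motor-mixing sketch for a "plus"-configuration quadcopter with
    # normalized command inputs; sign conventions are arbitrary here.
    def mix_rotor_commands(thrust: float, roll: float, pitch: float, yaw: float):
        """Return normalized speed commands for (front, back, left, right) rotors."""
        front = thrust + pitch - yaw   # front/back rotors receive -yaw while
        back  = thrust - pitch - yaw   # left/right receive +yaw, so the yaw
        left  = thrust + roll + yaw    # terms create a net torque about the
        right = thrust - roll + yaw    # vertical axis without changing total lift
        # Clamp to the admissible range of the electronic speed controllers.
        return tuple(max(0.0, min(1.0, cmd)) for cmd in (front, back, left, right))

    print(mix_rotor_commands(0.5, 0.0, 0.1, 0.0))  # pitch command: front/back differ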

In some embodiments, carrier 108 is coupled to movable object 102. A payload 110 is coupled to carrier 108. In some embodiments, carrier 108 includes one or more mechanisms that enable payload 110 to move relative to movable object 102, as described further with reference to FIG. 2B. In some embodiments, payload 110 is rigidly coupled to movable object 102 such that payload 110 remains substantially stationary relative to movable object 102. For example, carrier 108 is coupled to payload 110 such that payload 110 is not movable relative to movable object 102. In some embodiments, payload 110 is coupled to movable object 102 without requiring carrier 108.

Communication system 120 enables communication with control unit 104 and/or computing device 126, e.g., via wireless signals 124. The communication system 120 includes, e.g., transmitters, receivers, and/or transceivers for wireless communication. In some embodiments, the communication is one-way communication, such that data is transmitted only from movable object 102 to control unit 104, or vice-versa. In some embodiments, communication is two-way communication, such that data is transmitted in both directions between movable object 102 and control unit 104.

In some embodiments, movable object 102 communicates with computing device 126. In some embodiments, movable object 102, control unit 104, and/or computing device 126 are connected to the Internet or other telecommunications network, e.g., such that data generated by movable object 102, control unit 104, and/or computing device 126 is transmitted to a server for data storage and/or data retrieval (e.g., for display by a website).

In some embodiments, sensing system 122 of movable object 102 includes one or more sensors, as described further with reference to FIG. 3. In some embodiments, movable object 102 and/or control unit 104 use sensing data generated by sensors of sensing system 122 to determine information such as a position of movable object 102, an orientation of movable object 102, movement characteristics of movable object 102 (e.g., angular velocity, angular acceleration, translational velocity, translational acceleration and/or direction of motion along one or more axes), proximity of movable object 102 to potential obstacles, weather conditions, locations of geographical features and/or locations of manmade structures.

FIG. 2B illustrates an exemplary carrier 108 in a target tracking system 100, in accordance with embodiments. In some embodiments, carrier 108 couples a payload 110 to a movable object 102.

In some embodiments, carrier 108 includes a frame assembly including one or more frame members 202. In some embodiments, frame member 202 is coupled with movable object 102 and payload 110. In some embodiments, frame member 202 supports payload 110.

In some embodiments, carrier 108 includes one or more mechanisms, such as one or more actuators 204, to cause movement of carrier 108 and/or payload 110. Actuator 204 is, e.g., a motor, such as a hydraulic, pneumatic, electric, thermal, magnetic, and/or mechanical motor. In some embodiments, actuator 204 causes movement of frame member 202. In some embodiments, actuator 204 rotates payload 110 about one or more axes, such as three axes: X axis (“pitch axis”), Z axis (“roll axis”), and Y axis (“yaw axis”), relative to movable object 102. In some embodiments, actuator 204 translates payload 110 along one or more axes relative to movable object 102.

In some embodiments, carrier 108 includes one or more carrier sensing systems 206, e.g., for determining a state of carrier 108 or payload 110. Carrier sensing system 206 includes, e.g., motion sensors (e.g., accelerometers), rotation sensors (e.g., gyroscopes), potentiometers, and/or inertial sensors. In some embodiments, carrier sensing system 206 includes one or more sensors of movable object sensing system 122 as described below with regard to FIG. 3. Sensor data determined by carrier sensing system 206 includes, e.g., spatial disposition (e.g., position, orientation, or attitude) and/or movement information such as velocity (e.g., linear or angular velocity) and/or acceleration (e.g., linear or angular acceleration) of carrier 108 and/or payload 110. In some embodiments, sensing data and/or state information calculated from the sensing data are used as feedback data to control the movement of one or more components (e.g., frame member 202, actuator 204, and/or damping element 208) of carrier 108. Carrier sensor 206 is coupled to, e.g., frame member 202, actuator 204, damping element 208, and/or payload 110. In an embodiment, carrier sensor 206 (e.g., a potentiometer) measures movement of actuator 204 (e.g., the relative positions of a motor rotor and a motor stator) and generates a position signal representative of the movement of the actuator 204 (e.g., a position signal representative of relative positions of the motor rotor and the motor stator). In some embodiments, data generated by a carrier sensor 206 is received by processor(s) 116 and/or memory 118 of movable object 102.

In some embodiments, the coupling of carrier 108 to movable object 102 includes one or more damping elements 208. Damping elements 208 are configured to reduce or eliminate movement of the load (e.g., payload 110 and/or carrier 108) caused by movement of movable object 102. Damping elements 208 include, e.g., active damping elements, passive damping elements, and/or hybrid damping elements having both active and passive damping characteristics. The motion damped by the damping elements 208 can include one or more of vibrations, oscillations, shaking, or impacts. Such motions may originate from motions of movable object 102 that are transmitted to the load. For example, the motion may include vibrations caused by the operation of a propulsion system and/or other components of movable object 102.

In some embodiments, a damping element 208 provides motion damping by isolating the load from the source of unwanted motion by dissipating or reducing the amount of motion transmitted to the load (e.g., vibration isolation). In some embodiments, damping element 208 reduces the magnitude (e.g., amplitude) of the motion that would otherwise be experienced by the load. In some embodiments, the motion damping applied by a damping element 208 is used to stabilize the load, thereby improving the quality of images captured by the load (e.g., image capturing device or imaging device), as well as reducing the computational complexity of image stitching steps required to generate a panoramic image based on the captured images.

Damping element 208 described herein can be formed from any suitable material or combination of materials, including solid, liquid, or gaseous materials. The materials used for the damping elements may be compressible and/or deformable. For example, the damping element 208 is made of, e.g., sponge, foam, rubber, gel, and the like. For example, damping element 208 includes rubber balls that are substantially spherical in shape. The damping element 208 is, e.g., substantially spherical, rectangular, and/or cylindrical. In some embodiments, damping element 208 includes piezoelectric materials or shape memory materials. In some embodiments, damping elements 208 include one or more mechanical elements, such as springs, pistons, hydraulics, pneumatics, dashpots, shock absorbers, isolators, and the like. In some embodiments, properties of the damping element 208 are selected so as to provide a predetermined amount of motion damping. In some instances, the damping element 208 has viscoelastic properties. The properties of damping element 208 are, e.g., isotropic or anisotropic. In some embodiments, damping element 208 provides motion damping equally along all directions of motion. In some embodiments, damping element 208 provides motion damping only along a subset of the directions of motion (e.g., along a single direction of motion). For example, the damping element 208 may provide damping primarily along the Y (yaw) axis. In this manner, the illustrated damping element 208 reduces vertical motions.

In some embodiments, carrier 108 includes controller 210. Controller 210 includes, e.g., one or more controllers and/or processors. In some embodiments, controller 210 receives instructions from processor(s) 116 of movable object 102. For example, controller 210 is connected to processor(s) 116 via control bus 112. In some embodiments, controller 210 controls movement of actuator 204, adjusts one or more parameters of carrier sensor 206, receives data from carrier sensor 206, and/or transmits data to processor 116.

FIG. 2C illustrates an exemplary payload 110 in a target tracking system 100, in accordance with some embodiments. In some embodiments, payload 110 includes a payload sensing system 212 and a controller 218. In some embodiments, payload sensing system 212 includes an imaging device 214, such as a camera. In some embodiments, payload sensing system 212 includes one or more sensors of movable object sensing system 122 as described below with regard to FIG. 3.

Payload sensing system 212 generates static sensing data (e.g., a single image captured in response to a received instruction) and/or dynamic sensing data (e.g., a series of images captured at a periodic rate, such as a video). Imaging device 214 includes, e.g., an image sensor 216 to detect light (such as visible light, infrared light, and/or ultraviolet light). In some embodiments, imaging device 214 includes one or more optical devices (e.g., lenses) to focus or otherwise alter the light onto image sensor 216.

In some embodiments, image sensor 216 includes, e.g., semiconductor charge-coupled devices (CCD), active pixel sensors using complementary metal-oxide-semiconductor (CMOS) or N-type metal-oxide-semiconductor (NMOS, Live MOS) technologies, or any other types of sensors. Image sensor 216 and/or imaging device 214 capture, e.g., images and/or image streams (e.g., videos). Adjustable parameters of imaging device 214 include, e.g., width, height, aspect ratio, pixel count, resolution, quality, imaging mode, focus distance, depth of field, exposure time, shutter speed and/or lens configuration. In some embodiments, imaging device 214 is configured to capture high-definition or ultra-high-definition videos (e.g., 720p, 1080i, 1080p, 1440p, 2000p, 2160p, 2540p, 4000p, 4320p, and so on).

In some embodiments, payload 110 includes controller 218. Controller 218 includes, e.g., one or more controllers and/or processors. In some embodiments, controller 218 receives instructions from processor(s) 116 of movable object 102. For example, controller 218 is connected to processor(s) 116 via control bus 112. In some embodiments, controller 218 adjusts one or more parameters of one or more sensors of payload sensing system 212; receives data from one or more sensors of payload sensing system 212; and/or transmits data, such as image data from image sensor 216, to processor 116, memory 118, and/or control unit 104.

In some embodiments, data generated by one or more sensors of payload sensor system 212 is stored, e.g., by memory 118. In some embodiments, data generated by payload sensor system 212 is transmitted to control unit 104 (e.g., via communication system 120). For example, video is streamed from payload 110 (e.g., imaging device 214) to control unit 104. In this manner, control unit 104 displays, e.g., real-time (or slightly delayed) video received from imaging device 214.

In some embodiments, adjustment of the orientation, position, attitude, and/or one or more movement characteristics of movable object 102, carrier 108, and/or payload 110 is generated based at least in part on configurations (e.g., preset and/or user configured in system configuration 400) of movable object 102, carrier 108, and/or payload 110. For example, adjustment that involves rotation around two axes (e.g., yaw and pitch) is achieved solely by corresponding rotation of movable object 102 around the two axes if payload 110 including imaging device 214 is rigidly coupled to movable object 102 (and hence not movable relative to movable object 102) and/or payload 110 is coupled to movable object 102 via a carrier 108 that does not permit relative movement between imaging device 214 and movable object 102. The same two-axis adjustment is achieved by, e.g., combining adjustment of both movable object 102 and carrier 108 if carrier 108 permits imaging device 214 to rotate around at least one axis relative to movable object 102. In this case, carrier 108 can be controlled to implement the rotation around one or two of the two axes required for the adjustment and movable object 102 can be controlled to implement the rotation around one or two of the two axes. For example, carrier 108 includes, e.g., a one-axis gimbal that allows imaging device 214 to rotate around one of the two axes required for adjustment while the rotation around the remaining axis is achieved by movable object 102. In some embodiments, the same two-axis adjustment is achieved by carrier 108 alone when carrier 108 permits imaging device 214 to rotate around two or more axes relative to movable object 102. For example, carrier 108 includes a two-axis or three-axis gimbal.
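
The following Python sketch illustrates, under simplifying assumptions, how a required two-axis adjustment could be split between the carrier and the movable object depending on which axes the carrier's gimbal can actuate; the function name and data layout are illustrative only.

    # Split a required (yaw, pitch) rotation between carrier and movable object.
    def allocate_rotation(required: dict, gimbal_axes: set) -> tuple:
        """Axes the gimbal can actuate go to the carrier; the rest go to the vehicle."""
        carrier_cmd = {axis: ang for axis, ang in required.items() if axis in gimbal_axes}
        vehicle_cmd = {axis: ang for axis, ang in required.items() if axis not in gimbal_axes}
        return carrier_cmd, vehicle_cmd

    # Rigidly coupled payload (no gimbal): the vehicle performs both rotations.
    print(allocate_rotation({"yaw": 10.0, "pitch": -5.0}, set()))
    # One-axis gimbal (pitch only): carrier handles pitch, vehicle handles yaw.
    print(allocate_rotation({"yaw": 10.0, "pitch": -5.0}, {"pitch"}))
    # Two-axis gimbal: carrier alone can achieve the full adjustment.
    print(allocate_rotation({"yaw": 10.0, "pitch": -5.0}, {"yaw", "pitch"}))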

FIG. 3 illustrates an exemplary sensing system 122 of a movable object 102, in accordance with some embodiments. In some embodiments, one or more sensors of movable object sensing system 122 are mounted to the exterior, located within, or otherwise coupled to movable object 102. In some embodiments, one or more sensors of movable object sensing system 122 are components of carrier sensing system 206 and/or payload sensing system 212. Where sensing operations are described as being performed by movable object sensing system 122 herein, it will be recognized that such operations are optionally performed by carrier sensing system 206 and/or payload sensing system 212.

Movable object sensing system 122 generates static sensing data (e.g., a single image captured in response to a received instruction) and/or dynamic sensing data (e.g., a series of images captured at a periodic rate, such as a video).

In some embodiments, movable object sensing system 122 includes one or more image sensors 302, such as image sensor 308 (e.g., a left stereographic image sensor) and/or image sensor 310 (e.g., a right stereographic image sensor). Image sensors 302 capture, e.g., images, image streams (e.g., videos), stereographic images, and/or stereographic image streams (e.g., stereographic videos). Image sensors 302 detect light, such as visible light, infrared light, and/or ultraviolet light. In some embodiments, movable object sensing system 122 includes one or more optical devices (e.g., lenses) to focus or otherwise alter the light onto one or more image sensors 302. In some embodiments, image sensors 302 include, e.g., semiconductor charge-coupled devices (CCD), active pixel sensors using complementary metal-oxide-semiconductor (CMOS) or N-type metal-oxide-semiconductor (NMOS, Live MOS) technologies, or any other types of sensors.

In some embodiments, movable object sensing system 122 includes one or more audio transducers 304. For example, an audio detection system includes audio output transducer 312 (e.g., a speaker), and audio input transducer 314 (e.g., a microphone, such as a parabolic microphone). In some embodiments, a microphone and a speaker are used as components of a sonar system. In some embodiments, a sonar system is used to track a target object, e.g., by detecting location information of a target object.

In some embodiments, movable object sensing system 122 includes one or more infrared sensors 306. In some embodiments, a distance measurement system includes a pair of infrared sensors, e.g., infrared sensor 316 (such as a left infrared sensor) and infrared sensor 318 (such as a right infrared sensor), or another sensor or sensor pair. The distance measurement system is used to, e.g., measure a distance to a target 106.

In some embodiments, a system to produce a depth map includes one or more sensors or sensor pairs of movable object sensing system 122 (such as left stereographic image sensor 308 and right stereographic image sensor 310; audio output transducer 312 and audio input transducer 314; and/or left infrared sensor 316 and right infrared sensor 318). In some embodiments, a pair of sensors in a stereo data system (e.g., a stereographic imaging system) simultaneously captures data from different positions. In some embodiments, a depth map is generated by a stereo data system using the simultaneously captured data. In some embodiments, a depth map is used for positioning and/or detection operations, such as detecting a target object 106, and/or detecting current location information of a target object 106.
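
As one illustrative example of how a depth map can be derived from simultaneously captured stereo data, the Python sketch below converts a per-pixel disparity map into depth using the standard relation Z = f·B/d; rectified images, a known focal length, and a known baseline are assumed, and the numeric values are placeholders.

    # Convert a disparity map (pixels) to a depth map (meters): Z = f * B / d.
    import numpy as np

    def depth_from_disparity(disparity_px: np.ndarray,
                             focal_length_px: float,
                             baseline_m: float) -> np.ndarray:
        depth = np.full(disparity_px.shape, np.inf, dtype=np.float64)
        valid = disparity_px > 0                 # zero disparity -> no estimate
        depth[valid] = focal_length_px * baseline_m / disparity_px[valid]
        return depth

    disparity = np.array([[0.0, 8.0], [16.0, 32.0]])      # pixels
    depth = depth_from_disparity(disparity, focal_length_px=800.0, baseline_m=0.12)
    print(depth)   # larger disparity -> nearer point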

In some embodiments, movable object sensing system 122 includes one or more global positioning system (GPS) sensors, motion sensors (e.g., accelerometers), rotation sensors (e.g., gyroscopes), inertial sensors, proximity sensors (e.g., infrared sensors) and/or weather sensors (e.g., pressure sensor, temperature sensor, moisture sensor, and/or wind sensor).

In some embodiments, sensing data generated by one or more sensors of movable object sensing system 122 and/or information determined using sensing data from one or more sensors of movable object sensing system 122 are transmitted to control unit 104 (e.g., via communication system 120). In some embodiments, data generated by one or more sensors of movable object sensing system 122 and/or information determined using sensing data from one or more sensors of movable object sensing system 122 is stored by memory 118.

FIG. 4 is a block diagram illustrating an implementation of memory 118, in accordance with some embodiments. In some embodiments, one or more elements illustrated in FIG. 4 are located in control unit 104, computing device 126, and/or another device.

In some embodiments, memory 118 stores a system configuration 400. System configuration 400 includes one or more system settings (e.g., as configured by a manufacturer, administrator, and/or user). For example, a constraint on one or more of orientation, position, attitude, and/or one or more movement characteristics of movable object 102, carrier 108, and/or payload 110 is stored as a system setting of system configuration 400.

In some embodiments, memory 118 stores a motion control module 402. Motion control module 402 stores, e.g., control instructions, such as control instructions received from control unit 104 and/or computing device 126. Control instructions are used for, e.g., controlling operation of movement mechanisms 114, carrier 108, and/or payload 110.

In some embodiments, memory 118 stores a tracking module 404. In some embodiments, tracking module 404 generates tracking information for target 106 that is being tracked by movable object 102. In some embodiments, tracking information is generated based on images captured by imaging device 214 and/or output from image analysis module 406 (e.g., after pre-processing and/or processing operations have been performed on one or more images). Alternatively or in combination, tracking information is generated based on analysis of gestures of a human target. The gestures are captured by imaging device 214 and/or analyzed by gesture analysis module 403. Tracking information generated by tracking module 404 includes, for example, location, size, or other characteristics of target 106 within one or more images. In some embodiments, tracking information generated by tracking module 404 is transmitted to control unit 104 and/or computing device 126 (e.g., augmented with or otherwise combined with images and/or output from image analysis module 406). For example, tracking information is transmitted to control unit 104 in response to a request from control unit 104 and/or on a periodic basis.

In some embodiments, memory 118 includes an image analysis module 406. Image analysis module 406 performs processing operations on images, such as images captured by imaging device 214. In some embodiments, image analysis module 406 performs pre-processing on raw image data, such as re-sampling to assure the correctness of the image coordinate system, noise reduction, contrast enhancement, and/or scale space representation. In some embodiments, processing operations performed on image data (including image data that has been pre-processed) include feature extraction, image segmentation, data verification, image recognition, image registration, and/or image matching. In some embodiments, output from image analysis module 406 after pre-processing and/or processing operations have been performed on one or more images is transmitted to control unit 104. In some embodiments, feature extraction is performed by control unit 104, processor(s) 116 of movable object 102, and/or computing device 126. In some embodiments, image analysis module 406 may use a neural network to perform image recognition and/or classification of object(s) included in the image. For example, by comparing features extracted from target 106 included in the image with characteristics of one or more predetermined recognizable target object types, image analysis module 406 may recognize target 106 to be a certain predetermined recognizable target object type, e.g., a human.
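
A minimal Python sketch of the comparison step described above is shown below: an extracted feature vector is matched against stored reference descriptors for each predetermined recognizable type and labeled as the general type when no match is sufficiently close. The descriptors, similarity measure, and threshold are assumptions for this example and do not represent the actual recognition model.

    # Label a feature vector by its closest reference descriptor, if close enough.
    import numpy as np

    REFERENCE_DESCRIPTORS = {
        "human": np.array([0.9, 0.1, 0.4]),
        "car":   np.array([0.2, 0.8, 0.6]),
    }

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def recognize(feature_vector: np.ndarray, threshold: float = 0.9) -> str:
        best_type, best_score = "general", -1.0
        for type_name, reference in REFERENCE_DESCRIPTORS.items():
            score = cosine_similarity(feature_vector, reference)
            if score > best_score:
                best_type, best_score = type_name, score
        return best_type if best_score >= threshold else "general"

    print(recognize(np.array([0.85, 0.15, 0.35])))   # close to the "human" descriptor
    print(recognize(np.array([0.1, 0.1, 0.9])))      # no close match -> "general"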

In some embodiments, memory 118 includes a gesture analysis module 403. Gesture analysis module 403 performs processing operations on gestures of one or more human targets. The gestures may be captured by imaging device 214. Gesture analysis results may be fed to tracking module 404 and/or motion control module 402 for generating tracking information and/or control instructions, respectively, for controlling operations of movement mechanisms 114, carrier 108, and/or payload 110 of movable object 102. In some embodiments, a calibration process may be performed before using gestures of a human target to control movable object 102. For example, during the calibration process, gesture analysis module 403 captures certain features of human gestures associated with a certain control command and stores the gesture features in memory 118. When a human gesture is received, gesture analysis module 403 may extract features of the human gesture and compare them with the stored features to determine whether the certain command is performed by the user. The correlations between gestures and control commands associated with a certain human target may or may not be different from such correlations associated with another human target.
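
For illustration only, the Python sketch below captures the calibrate-then-match idea described above, assuming a gesture is summarized as a small numeric feature vector; the storage layout, distance measure, and threshold are assumptions of this example.

    # Store per-command gesture features at calibration time, then match later gestures.
    from typing import Optional
    import numpy as np

    class GestureCalibration:
        def __init__(self, match_threshold: float = 0.5):
            self.command_features = {}       # command name -> stored feature vector
            self.match_threshold = match_threshold

        def calibrate(self, command: str, features: np.ndarray) -> None:
            """Store the feature vector captured for a command during calibration."""
            self.command_features[command] = np.asarray(features, dtype=float)

        def match(self, features: np.ndarray) -> Optional[str]:
            """Return the calibrated command closest to the observed gesture, if any."""
            observed = np.asarray(features, dtype=float)
            best_cmd, best_dist = None, float("inf")
            for command, stored in self.command_features.items():
                dist = float(np.linalg.norm(stored - observed))
                if dist < best_dist:
                    best_cmd, best_dist = command, dist
            return best_cmd if best_dist <= self.match_threshold else None

    cal = GestureCalibration()
    cal.calibrate("take_photo", np.array([1.0, 0.2, 0.0]))   # per-user calibration
    print(cal.match(np.array([0.9, 0.25, 0.05])))            # -> "take_photo"
    print(cal.match(np.array([0.0, 0.9, 0.9])))              # -> None (no command)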

In some embodiments, memory 118 includes a spatial relationship determination module 405. Spatial relationship determination module 405 calculates one or more spatial relationships between target 106 and movable object 102. In some embodiments, the spatial relationships between target 106 and movable object 102 include a horizontal distance between target 106 and movable object 102 and/or a pitch angle between target 106 and movable object 102.
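
The geometry actually used is described with reference to FIGS. 11-15; as a rough illustration only, the Python sketch below shows one simplified way a pitch angle and a horizontal distance could be estimated, assuming a flat ground plane, a ground-level target, and a known relative altitude. All parameter names and values are illustrative.

    import math

    def pitch_angle_to_target(gimbal_pitch_deg: float,
                              target_offset_px: float,
                              image_height_px: int,
                              vertical_fov_deg: float) -> float:
        """Pitch angle (down from horizontal) of the line of sight to the target."""
        deg_per_px = vertical_fov_deg / image_height_px
        return gimbal_pitch_deg + target_offset_px * deg_per_px

    def horizontal_distance(relative_altitude_m: float, pitch_deg: float) -> float:
        """Horizontal distance to a ground-level target (requires pitch_deg > 0)."""
        return relative_altitude_m / math.tan(math.radians(pitch_deg))

    pitch = pitch_angle_to_target(gimbal_pitch_deg=20.0, target_offset_px=60.0,
                                  image_height_px=720, vertical_fov_deg=45.0)
    print(round(pitch, 2), "deg")                            # line-of-sight pitch angle
    print(round(horizontal_distance(50.0, pitch), 1), "m")   # horizontal distance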

In some embodiments, memory 118 stores target information 408. In some embodiments, target information 408 is received by movable object 102 (e.g., via communication system 120) from control unit 104, computing device 126, target 106, and/or another movable object 102.

In some embodiments, target information 408 includes a time value and/or expiration time indicating a period of time during which the target 106 is to be tracked. In some embodiments, target information 408 includes a flag indicating whether a targeting information entry includes specific target information 412 and/or target type information 410.

In some embodiments, target information 408 includes target type information 410 such as color, texture, pattern, size, shape, and/or dimension. In some embodiments, target type information includes, but is not limited to, a predetermined recognizable object type and a general object type as identified by image analysis module 406. In some embodiments, target type information 410 includes features or characteristics for each type of target and is preset and stored in memory 118. In some embodiments, target type information 410 is, e.g., provided by a user to a user input device, such as a user input device of control unit 104. In some embodiments, the user may select a pre-existing target pattern or type (e.g., a black object or a round object with a radius greater or less than a certain value).

In some embodiments, target information 408 includes tracked target information 412 for a specific target 106 being tracked. Target information 408 may be identified by image analysis module 406 by analyzing the target in a captured image. Tracked target information 412 includes, e.g., an image of target 106, an initial position (e.g., location coordinates, such as pixel coordinates within an image) of target 106, and/or a size of target 106 within one or more images (e.g., images captured by imaging device 214 of payload 110). A size of target 106 is stored, e.g., as a length (e.g., mm or other length unit), an area (e.g., mm² or other area unit), a number of pixels in a line (e.g., indicating a length, width, and/or diameter), a ratio of a length of a representation of the target in an image relative to a total image length (e.g., a percentage), a ratio of an area of a representation of the target in an image relative to a total image area (e.g., a percentage), a number of pixels indicating an area of target 106, and/or a corresponding spatial relationship (e.g., a vertical distance and/or a horizontal distance) between target 106 and movable object 102 (e.g., an area of target 106 changes based on a distance of target 106 from movable object 102).
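
As an illustrative sketch of how tracked target information 412 might be organized in memory, the Python data structure below groups the quantities listed above; the field names, units, and defaults are assumptions made for this example.

    # Hypothetical container for tracked target information.
    from dataclasses import dataclass, field
    from typing import Optional, Tuple

    @dataclass
    class TrackedTargetInfo:
        target_id: int
        initial_position_px: Tuple[int, int]           # pixel coordinates in the image
        size_px: Tuple[int, int]                       # width, height in pixels
        area_ratio: float                              # target area / total image area
        target_type: str = "general"                   # e.g., "human", "car", "general"
        horizontal_distance_m: Optional[float] = None  # spatial relationship, if known
        image_crop: Optional[bytes] = field(default=None, repr=False)  # stored target image

    info = TrackedTargetInfo(target_id=1, initial_position_px=(320, 180),
                             size_px=(40, 120), area_ratio=0.016, target_type="human")
    print(info)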

In some embodiments, one or more features (e.g., characteristics) of target 106 are determined from an image of target 106 (e.g., using image analysis techniques on images captured by imaging device 214). For example, one or more features of target 106 are determined from an orientation and/or part or all of identified boundaries of target 106. In some embodiments, tracked target information 412 includes pixel coordinates and/or pixel counts to indicate, e.g., a size parameter, position, and/or shape of a target 106. In some embodiments, one or more features of the tracked target information 412 are to be maintained as movable object 102 tracks target 106 (e.g., the tracked target information 412 is to be maintained as images of target 106 are captured by imaging device 214). Tracked target information 412 is used, e.g., to adjust movable object 102, carrier 108, and/or imaging device 214, e.g., such that the specified features of target 106 are substantially maintained. In some embodiments, tracked target information 412 is determined based on one or more of target type information 410.

In some embodiments, memory 118 also includes predetermined recognizable target type information 414. Predetermined recognizable target type information 414 specifies one or more characteristics of a certain predetermined recognizable target type (e.g., type 1, type 2 . . . type n). Each predetermined recognizable target type may include one or more characteristics such as a size parameter (e.g., area, diameter, height, length, and/or width), position (e.g., relative to an image center and/or image boundary), movement (e.g., speed, acceleration, altitude), and/or shape. In one example, type 1 may be a human target. One or more characteristics associated with a human target may include a height in a range from about 1.5 meters to about 2 meters, a pattern comprising a human head, a human torso, and human limbs, and/or a moving speed in a range from about 2 kilometers/hour to about 25 kilometers/hour. In another example, type 2 may be a car target. One or more characteristics associated with a car target may include a height in a range from about 1.4 meters to about 4.5 meters, a length in a range from about 3 meters to about 10 meters, a moving speed in a range from about 5 kilometers/hour to about 140 kilometers/hour, and/or a pattern of a sedan, an SUV, a truck, or a bus. In yet another example, type 3 may be a ship target. Other types of predetermined recognizable target objects may also include an airplane target, an animal target, etc. Each type may further include one or more subtypes that include more specific characteristics for each subtype. The characteristics of each subtype may provide more accurate target classification results.
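
For illustration, the Python sketch below stores preset characteristics for two example types and checks measured quantities against them; the dictionary layout and the simple range test are assumptions of this example, while the numeric ranges follow the examples given in the preceding paragraph.

    # Preset per-type characteristic ranges and a simple range-based classifier.
    TYPE_CHARACTERISTICS = {
        "human": {"height_m": (1.5, 2.0), "speed_kmh": (2.0, 25.0)},
        "car":   {"height_m": (1.4, 4.5), "length_m": (3.0, 10.0), "speed_kmh": (5.0, 140.0)},
    }

    def matches_type(measurements: dict, characteristics: dict) -> bool:
        """True if every measured quantity falls inside that type's preset range."""
        for name, (low, high) in characteristics.items():
            value = measurements.get(name)
            if value is None or not (low <= value <= high):
                return False
        return True

    def classify(measurements: dict) -> str:
        for type_name, characteristics in TYPE_CHARACTERISTICS.items():
            if matches_type(measurements, characteristics):
                return type_name
        return "general"

    print(classify({"height_m": 1.75, "speed_kmh": 6.0}))                   # -> "human"
    print(classify({"height_m": 1.6, "length_m": 4.3, "speed_kmh": 60.0}))  # -> "car"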

In some embodiments, target information 408 (including, e.g., target type information 410 and information for a tracked target 412), and/or predetermined recognizable target information 414 is generated based on user input, such as input received at user input device 506 (shown in FIG. 5) of control unit 104. Additionally or alternatively, target information is generated based on data from sources other than control unit 104. For example, target type information 410 may be based on stored previous images of target 106 (e.g., images captured by imaging device 214 and stored by memory 118), other data stored by memory 118, and/or data from data stores that are remote from control unit 104 and/or movable object 102. In some embodiments, target type information 410 is generated using a computer-generated image of target 106.

In some embodiments, target information 408 is used by movable object 102 to track target 106. For example, target information 408 is used by tracking module 404. In some embodiments, target information 408 is used by an image analysis module 406 to identify and/or classify target 106. In some cases, target identification involves image recognition and/or matching algorithms based on, e.g., CAD-like object models, appearance-based methods, feature-based methods, and/or genetic algorithms. In some embodiments, target identification includes comparing two or more images to determine, extract, and/or match features contained therein.

The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various embodiments. In some embodiments, memory 118 may store a subset of the modules and data structures identified above. Furthermore, memory 118 may store additional modules and data structures not described above. In some embodiments, the programs, modules, and data structures stored in memory 118, or a non-transitory computer readable storage medium of memory 118, provide instructions for implementing respective operations in the methods described below. In some embodiments, some or all of these modules may be implemented with specialized hardware circuits that subsume part or all of the module functionality. One or more of the above identified elements may be executed by one or more processors 116 of movable object 102. In some embodiments, one or more of the above identified elements is executed by one or more processors of a device remote from movable object 102, such as control unit 104 and/or computing device 126.

FIG. 5 illustrates an exemplary control unit 104 of target tracking system 100, in accordance with some embodiments. In some embodiments, control unit 104 communicates with movable object 102 via communication system 120, e.g., to provide control instructions to movable object 102. Although control unit 104 is typically a portable (e.g., handheld) device, control unit 104 need not be portable. In some embodiments, control unit 104 is a dedicated control device (e.g., dedicated to operation of movable object 102), a laptop computer, a desktop computer, a tablet computer, a gaming system, a wearable device (e.g., watches, glasses, gloves, and/or helmet), a microphone, and/or a combination thereof.

Control unit 104 typically includes one or more processing units 502, a communication system 510 (e.g., including one or more network or other communications interfaces), memory 504, one or more input/output (I/O) interfaces (e.g., display 508 and/or input device 506) and one or more communication buses 512 for interconnecting these components.

In some embodiments, a touchscreen display includes display 508 and input device 506. A touchscreen display optionally uses LCD (liquid crystal display) technology, LPD (light emitting polymer display) technology, or LED (light emitting diode) technology, although other display technologies are used in other embodiments. A touchscreen display and processor(s) 502 optionally detect contact and any movement or breaking thereof using any of a plurality of touch sensing technologies now known or later developed, including but not limited to capacitive, resistive, infrared, and surface acoustic wave technologies, as well as other proximity sensor arrays or other elements for determining one or more points of contact with the touchscreen display.

In some embodiments, input device 506 includes, e.g., one or more joysticks, switches, knobs, slide switches, buttons, dials, keypads, keyboards, mice, audio transducers (e.g., microphones for voice control systems), motion sensors, and/or gesture controls. In some embodiments, an I/O interface of control unit 104 includes sensors (e.g., GPS sensors, and/or accelerometers), audio output transducers (e.g., speakers), and/or one or more tactile output generators for generating tactile outputs.

In some embodiments, input device 506 receives user input to control aspects of movable object 102, carrier 108, payload 110, or a component thereof. Such aspects include, e.g., attitude, position, orientation, velocity, acceleration, navigation, and/or tracking. For example, input device 506 is manually set by a user to one or more positions, each of the positions corresponding to a predetermined input for controlling movable object 102. In some embodiments, input device 506 is manipulated by a user to input control instructions for controlling the navigation of movable object 102. In some embodiments, input device 506 is used to input a flight mode for movable object 102, such as auto pilot or navigation according to a predetermined navigation path.

In some embodiments, input device 506 is used to input a target tracking mode for movable object 102, such as a manual tracking mode or an automatic tracking mode. In some embodiments, the user controls movable object 102, e.g., the position, attitude, and/or orientation of movable object 102, by changing a position of control unit 104 (e.g., by tilting or otherwise moving control unit 104). For example, a change in a position of control unit 104 is detected by, e.g., one or more inertial sensors and output of the one or more inertial sensors is used to generate command data. In some embodiments, input device 506 is used to adjust an operational parameter of the payload, such as a parameter of a payload sensing system 212 (e.g., to adjust a zoom parameter of imaging device 214) and/or a position of payload 110 relative to carrier 108 and/or movable object 102.

In some embodiments, input device 506 is used to indicate information about target 106, e.g., to select a target 106 to track and/or to indicate target type information 410. In some embodiments, input device 506 is used for interaction with augmented image data. For example, an image displayed by display 508 includes representations of one or more targets 106. In some embodiments, representations of the one or more targets 106 are augmented to indicate identified objects for potential tracking and/or a target 106 that is currently being tracked. Augmentation includes, for example, a graphical tracking indicator (e.g., a box) adjacent to or surrounding a respective target 106. In some embodiments, input device 506 is used to select a target 106 to track or to change from a target 106 being tracked to a different target for tracking. In some embodiments, a target 106 is selected when an area corresponding to a representation of target 106 is selected by, e.g., a finger, stylus, mouse, joystick, or other component of input device 506. In some embodiments, specific target information 412 is generated when a user selects a target 106 to track.

The control unit 104 may also be configured to allow a user to enter target information using any suitable method. In some embodiments, input device 506 receives a selection of a target 106 from one or more images (e.g., video or snapshot) displayed by display 508. For example, input device 506 receives input including a selection performed by a gesture around target 106 and/or a contact at a location corresponding to target 106 in an image. In some embodiments, computer vision or other techniques are used to determine a boundary of a target 106. In some embodiments, input received at input device 506 defines a boundary of target 106. In some embodiments, multiple targets are simultaneously selected. In some embodiments, a selected target is displayed with a selection indicator (e.g., a bounding box) to indicate that the target is selected for tracking. In some other embodiments, input device 506 receives input indicating information such as color, texture, shape, dimension, and/or other characteristics associated with a target 106. For example, input device 506 includes a keyboard to receive typed input indicating target information 408.

In some embodiments, a control unit 104 provides an interface that enables a user to select (e.g., using input device 506) between a manual tracking mode and an automatic tracking mode. When the manual tracking mode is selected, the interface enables the user to select a target 106 to track. For example, a user is enabled to manually select a representation of a target 106 from an image displayed by display 508 of control unit 104. Specific target information 412 associated with the selected target 106 is transmitted to movable object 102, e.g., as initial expected target information.

In some embodiments, when the automatic tracking mode is selected, the user does not provide input selecting a target 106 to track. In some embodiments, input device 506 receives target type information 410 from user input. In some embodiments, movable object 102 uses the target type information 410, e.g., to automatically identify the target 106 to be tracked and/or to track the identified target 106.

Typically, manual tracking requires more user control of the tracking of the target and less automated processing or computation (e.g., image or target recognition) by processor(s) 116 of movable object 102, while automatic tracking requires less user control of the tracking process but more computation performed by processor(s) 116 of movable object 102 (e.g., by image analysis module 406). In some embodiments, allocation of control over the tracking process between the user and the onboard processing system is adjusted, e.g., depending on factors such as the surroundings of movable object 102, motion of movable object 102, altitude of movable object 102, system configuration 400 (e.g., user preferences), and/or available computing resources (e.g., CPU or memory) of movable object 102, control unit 104, and/or computing device 126. For example, relatively more control is allocated to the user when movable object 102 is navigating in a relatively complex environment (e.g., with numerous buildings or obstacles, or indoors) than when movable object 102 is navigating in a relatively simple environment (e.g., wide open space or outdoors). As another example, more control is allocated to the user when movable object 102 is at a lower altitude than when movable object 102 is at a higher altitude. As a further example, more control is allocated to movable object 102 if movable object 102 is equipped with a high-speed processor adapted to perform complex computations relatively quickly. In some embodiments, the allocation of control over the tracking process between the user and movable object 102 is dynamically adjusted based on one or more of the factors described herein.

In some embodiments, control unit 104 includes an electronic device (e.g., a portable electronic device) and an input device 506 that is a peripheral device that is communicatively coupled (e.g., via a wireless and/or wired connection) and/or mechanically coupled to the electronic device. For example, control unit 104 includes a portable electronic device (e.g., a smartphone) and a remote control device (e.g., a standard remote control with a joystick) coupled to the portable electronic device. In this example, an application executed by the smartphone generates control instructions based on input received at the remote control device.

In some embodiments, the display device 508 displays information about movable object 102, carrier 108, and/or payload 110, such as position, attitude, orientation, movement characteristics of movable object 102, and/or distance between movable object 102 and another object (e.g., target 106 and/or an obstacle). In some embodiments, information displayed by display device 508 includes images captured by imaging device 214, tracking data (e.g., a graphical tracking indicator applied to a representation of target 106, such as a box or other shape around target 106 shown to indicate that target 106 is currently being tracked), and/or indications of control data transmitted to movable object 102. In some embodiments, the images including the representation of target 106 and the graphical tracking indicator are displayed in substantially real-time as the image data and tracking information are received from movable object 102 and/or as the image data is acquired.

The communication system 510 enables communication with communication system 120 of movable object 102, communication system 610 of computing device 126, and/or a base station (e.g., computing device 126) via a wired or wireless communication connection. In some embodiments, the communication system 510 transmits control instructions (e.g., navigation control instructions, target information, and/or tracking instructions). In some embodiments, the communication system 510 receives data (e.g., tracking data from payload imaging device 214, and/or data from movable object sensing system 122). In some embodiments, control unit 104 receives tracking data (e.g., via wireless communications 124) from movable object 102. Tracking data is used by control unit 104 to, e.g., display target 106 as the target is being tracked. In some embodiments, data received by control unit 104 includes raw data (e.g., raw sensing data as acquired by one or more sensors) and/or processed data (e.g., raw data as processed by, e.g., tracking module 404).

In some embodiments, memory 504 stores instructions for generating control instructions automatically and/or based on input received via input device 506. The control instructions include, e.g., control instructions for operating movement mechanisms 114 of movable object 102 (e.g., to adjust the position, attitude, orientation, and/or movement characteristics of movable object 102, such as by providing control instructions to actuators 132). In some embodiments, the control instructions adjust movement of movable object 102 with up to six degrees of freedom. In some embodiments, the control instructions are generated to initialize and/or maintain tracking of a target 106 (e.g., as described further with regard to FIG. 7). In some embodiments, control instructions include instructions for adjusting carrier 108 (e.g., instructions for adjusting damping element 208, actuator 204, and/or one or more sensors of carrier sensing system 206 of carrier 108). In some embodiments, control instructions include instructions for adjusting payload 110 (e.g., instructions for adjusting one or more sensors of payload sensing system 212). In some embodiments, control instructions include control instructions for adjusting the operations of one or more sensors of movable object sensing system 122.

In some embodiments, memory 504 also stores instructions for performing image recognition, target classification, spatial relationship determination, and/or gesture analysis that are similar to the corresponding functionalities discussed with regard to FIG. 4. Memory 504 may also store target information, such as tracked target information and/or predetermined recognizable target type information as discussed in FIG. 4.

In some embodiments, input device 506 receives user input to control one aspect of movable object 102 (e.g., the zoom of the imaging device 214) while a control application generates the control instructions for adjusting another aspect of movable object 102 (e.g., to control one or more movement characteristics of movable object 102). The control application includes, e.g., control module 402, tracking module 404, and/or a control application of control unit 104 and/or computing device 126. For example, input device 506 receives user input to control one or more movement characteristics of movable object 102 while the control application generates the control instructions for adjusting a parameter of imaging device 214. In this manner, a user is enabled to focus on controlling the navigation of movable object 102 without having to provide input for tracking the target (e.g., tracking is performed automatically by the control application).

In some embodiments, allocation of tracking control between user input received at input device 506 and the control application varies depending on factors such as, e.g., surroundings of movable object 102, motion of movable object 102, altitude of movable object 102, system configuration (e.g., user preferences), and/or available computing resources (e.g., CPU or memory) of movable object 102, control unit 104, and/or computing device 126. For example, relatively more control is allocated to the user when movable object 102 is navigating in a relatively complex environment (e.g., with numerous buildings or obstacles, or indoors) than when movable object 102 is navigating in a relatively simple environment (e.g., wide open space or outdoors). As another example, more control is allocated to the user when movable object 102 is at a lower altitude than when movable object 102 is at a higher altitude. As a further example, more control is allocated to movable object 102 if movable object 102 is equipped with a high-speed processor adapted to perform complex computations relatively quickly. In some embodiments, the allocation of control over the tracking process between the user and movable object 102 is dynamically adjusted based on one or more of the factors described herein.

FIG. 6 illustrates an exemplary computing device 126 for controlling movable object 102, in accordance with embodiments. Computing device 126 is, e.g., a server computer, laptop computer, desktop computer, tablet, or phone. Computing device 126 typically includes one or more processing units 602, memory 604, communication system 610, and one or more communication buses 612 for interconnecting these components. In some embodiments, computing device 126 includes input/output (I/O) interfaces 606, e.g., display device 616 and/or input device 614.

In some embodiments, computing device 126 is a base station that communicates (e.g., wirelessly) with movable object 102 and/or control unit 104.

In some embodiments, computing device 126 provides data storage, data retrieval, and/or data processing operations, e.g., to reduce the processing power and/or data storage requirements of movable object 102 and/or control unit 104. For example, computing device 126 is communicatively connected to a database (e.g., via communication system 610), and/or computing device 126 includes a database (e.g., a database connected to communication bus 612).

Communication system 610 includes one or more network or other communications interfaces. In some embodiments, computing device 126 receives data from movable object 102 (e.g., from one or more sensors of movable object sensing system 122) and/or control unit 104. In some embodiments, computing device 126 transmits data to movable object 102 and/or control unit 104. For example, computing device 126 provides control instructions to movable object 102.

In some embodiments, memory 604 stores instructions for performing image recognition, target classification, spatial relationship determination, and/or gesture analysis that are similar to the corresponding functionalities discussed with regard to FIG. 4. Memory 604 may also store target information, such as tracked target information and/or predetermined recognizable target type information as discussed in FIG. 4.

FIG. 7 is a flow diagram illustrating a method 700 for performing initialization for target tracking, in accordance with some embodiments. The method 700 is performed at a device or a system including one or more devices, such as movable object 102, control unit 104, and/or computing device 126.

The system obtains (702) an image frame captured by imaging device 214 borne by movable object 102. In some embodiments, imaging device 214 captures an image, and target 106 may be identified by a user when the user views the captured image on a device, e.g., display 508 and/or display 616. For example, the user may tap, circle, click, or use any other suitable interaction method (e.g., using a gesture) to indicate the user's interest in target 106 contained in the image frame. User input may be received from input device 506 and/or input device 614. In some embodiments, more than one target 106 may be contained in the image frame. In some embodiments, target 106 is displayed with a selection indicator (e.g., a bounding box) to indicate that the target is selected by the user.

In some embodiments, a user may first provide some target information, e.g., location information of target 106. Such target information may be provided using any input device as discussed herein. Movable object 102 and/or carrier 108 may be adjusted manually or automatically by motion control module 402 to point imaging device 214 in a direction of target 106 based on the provided target information. In some embodiments, one or more sensors of movable object sensing system 122 may be used independently or in combination with imaging device 214 to identify target 106. Imaging device 214 may then capture an image containing target 106. The user may further identify/confirm target 106 in the captured image when the user views the image displayed on a device as discussed elsewhere herein.

The system performs (704) target classification. In some embodiments, the system extracts one or more features of target 106 in the captured image. In some embodiments, target information 408, such as target type 410 and/or tracked target information 412, may be identified by image analysis module(s) (e.g., image analysis module 406 and/or similar image analysis modules at memory 504 and/or memory 604) based on the extracted features. In some embodiments, target information 408 may be obtained from memory 118 of movable object 102, memory 504 of control unit 104, and/or memory 604 of computing device 126.

In some embodiments, image analysis module(s), such as image analysis module 406 and/or similar image analysis modules at memory 504 and/or memory 604, perform image recognition or identification techniques to identify target 106 based on extracted target information 408. In some embodiments, the system may identify a type (or a category) of target 106. For example, target 106 may be identified to be a moving object or a still object. Target 106 may be identified to be a human, an animal, a car, a ship, or any other suitable object type. In some embodiments, image analysis module(s) use a neural network to perform image recognition and/or classification of object(s) included in the image. In some embodiments, the system performs target classification automatically. In some embodiments, one or more steps of target classification may be performed manually by the user. In one example, the user may use an input device to indicate a type of target 106. In another example, the system may present more than one target type candidate identified via image analysis to the user, and the user may select a certain type to be associated with target 106.

The system determines (706) whether target 106 is a predetermined recognizable target (PRT) type as previously stored. For example, one or more predetermined recognizable target types and the corresponding characteristics are stored at memory 118 of movable object 102, memory 504 of control unit 104, and/or memory 604 of computing device 126 as discussed elsewhere herein. In some embodiments, image analysis module(s), such as image analysis module 406 and/or similar image analysis modules at memory 504 and/or memory 604, compare the extracted features with one or more characteristics associated with a predetermined individual type of predetermined recognizable target type information 414. The comparison result between the one or more features of target 106 and the one or more characteristics of a certain predetermined recognizable target type may be assigned a matching score. Target 106 may be identified to be a certain predetermined recognizable target type, e.g., a human target, based on a highest matching score. In one example, the pattern (or shape), the height, and/or the speed of target 106 may be determined to be similar to a human pattern, within a height range, and within a speed range of a human target type stored in predetermined recognizable target type information 414. Target 106 is thus determined to be a human target as a predetermined recognizable target type.
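
To make the matching-score comparison concrete, the following is a minimal sketch in Python. The feature names (height_m, speed_mps), the characteristic ranges, and the count-based scoring rule are illustrative assumptions, not the scoring rule of the disclosed embodiments.

```python
# Sketch of step 706: score extracted target features against the stored
# characteristics of each predetermined recognizable target (PRT) type and
# pick the best match above a minimum score.
PRT_TYPES = {
    "human": {"height_range_m": (1.4, 2.0), "speed_range_mps": (0.0, 8.0)},
    "car":   {"height_range_m": (1.3, 1.6), "speed_range_mps": (0.0, 40.0)},
}

def in_range(value, lo, hi):
    return lo <= value <= hi

def matching_score(features, characteristics):
    """Count how many extracted features fall inside the stored ranges."""
    score = 0
    if in_range(features["height_m"], *characteristics["height_range_m"]):
        score += 1
    if in_range(features["speed_mps"], *characteristics["speed_range_mps"]):
        score += 1
    return score

def classify_target(features, min_score=2):
    best_type, best_score = None, -1
    for prt_type, characteristics in PRT_TYPES.items():
        score = matching_score(features, characteristics)
        if score > best_score:
            best_type, best_score = prt_type, score
    return best_type if best_score >= min_score else "generic"

print(classify_target({"height_m": 1.7, "speed_mps": 1.5}))  # -> "human"
```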

When the system determines that target 106 is a certain predetermined recognizable target type, the system determines (708) whether a dimension of target 106 satisfies requirements for target tracking associated with the predetermined recognizable target type. For example, the system compares the dimension of target 106 with a first predetermined threshold value. When the dimension of target 106 is lower than the first predetermined threshold value, the system may not be able to obtain accurate or sufficient target information to perform automatic target tracking accurately. In some embodiments, when the dimension of target 106 is greater than a second predetermined threshold value (which is larger than the first predetermined threshold value), the system may not be able to obtain accurate target information to perform automatic target tracking accurately due to the large target dimension. In some embodiments, the system determines a length, a height, a width, a thickness, a diameter, an area, and/or any other suitable dimensional factor of target 106. In some embodiments, the system determines the dimension of target 106 using pixel information as discussed with reference to FIG. 4. In some embodiments, the predetermined threshold may be a predetermined minimum number of pixels on a captured image. In some embodiments, the predetermined threshold value may or may not be different for different types of target as identified in step 706.
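
A minimal sketch of the dimension check in step 708 follows, assuming hypothetical per-type pixel thresholds; the actual threshold values and the dimensional factor used may differ.

```python
# Sketch of step 708: check whether the target's on-image dimension (here, the
# bounding-box height in pixels) lies between assumed per-type minimum and
# maximum pixel thresholds.
PIXEL_THRESHOLDS = {
    "human":   {"min_px": 20, "max_px": 300},
    "generic": {"min_px": 10, "max_px": 350},
}

def dimension_ok(box_height_px, target_type):
    limits = PIXEL_THRESHOLDS.get(target_type, PIXEL_THRESHOLDS["generic"])
    return limits["min_px"] <= box_height_px <= limits["max_px"]

print(dimension_ok(15, "human"))   # False: too few pixels for reliable tracking
print(dimension_ok(120, "human"))  # True
```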

When the system determines that the dimension of target 106 in the captured image is appropriate for automatic target tracking, the system enters a mode 710 for initiating target tracking associated with the identified predetermined recognizable target (PRT) type (e.g., using PRT mode 710). For example, one or more characteristics associated with the identified predetermined recognizable target type are used during the initiation process for target tracking and/or the target tracking process.

In some alternative embodiments, step 708 is optional in method 700. For instance, when the system determines (706) that target 106 is a certain predetermined recognizable target type, the system enters the PRT mode 710. The system initiates the target tracking features associated with the identified predetermined recognizable target type. For example, the system uses one or more characteristics (e.g., target size and/or target speed of predetermined recognizable target type information 414) associated with the identified predetermined recognizable target type in the initiation process for target tracking and/or the target tracking process.

In PRT mode 710, the system determines (712) a spatial relationship between target 106 and movable object 102. In some embodiments, the spatial relationship is determined using one or more characteristics of the identified predetermined recognizable target type. For example, when target 106 is identified to be a human target, the system associates the target with a human with an average height of 1.7 meters based on predetermined recognizable target type information 414. Then, based on the number of pixels on the captured image along a height dimension of the human target of about 1.7 meters high, the system knows the real-world size to which each pixel corresponds. This information can be used for calculating and/or verifying the spatial relationship between target 106 and movable object 102. In some embodiments, the spatial relationship includes a horizontal distance between target 106 and movable object 102. In some embodiments, the spatial relationship includes a pitch angle to indicate a relative position relationship between target 106 and movable object 102. The pitch angle may be determined using a pitch angle of a gimbal borne by movable object 102 for carrying imaging device 214, and a target pitch angle of target 106 displayed on the captured image. In some embodiments, the spatial relationship may also include a height of movable object 102.
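
As a rough illustration of how a recognized type's assumed real-world height yields a per-pixel scale, the following sketch assumes the 1.7-meter human height from the text and a hypothetical bounding-box height in pixels.

```python
# Sketch of the PRT-mode scale estimate in step 712: divide the assumed
# real-world height of a recognized human target by its bounding-box height in
# pixels to get the real-world size represented by one pixel.
ASSUMED_HUMAN_HEIGHT_M = 1.7

def meters_per_pixel(box_height_px):
    return ASSUMED_HUMAN_HEIGHT_M / box_height_px

# A 170-pixel-tall bounding box implies roughly 0.01 m of real-world height per pixel.
print(round(meters_per_pixel(170), 4))  # 0.01
```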

The system determines (714) whether the spatial relationship between target 106 and movable object 102 is appropriate for target tracking.

In one example, when movable object 102 is too high, e.g., higher than about 50 meters, the system may not be able to obtain sufficient pixel information of target 106 on the ground. But when movable object 102 is too low, e.g., lower than 2 meters, there could be safety concerns. Thus the system may maintain a suitable height range for target tracking. In some embodiments, the suitable horizontal distance range is determined based on the height of movable object 102. In some embodiments, the higher the movable object, the broader the suitable horizontal distance range. For example, when the height of movable object 102 is about 3 meters, the allowed horizontal distance range is from about 3 meters to about 10 meters. When the height of movable object 102 is about 20 meters, the allowed horizontal distance range is from about 0 meters to about 30 meters.
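
One possible way to derive an allowed horizontal-distance range for intermediate heights is to interpolate between the two example operating points quoted above; the linear interpolation itself is an assumption for illustration, not part of the described method.

```python
# Sketch: interpolate the allowed horizontal-distance range between the two
# example operating points (3 m height -> 3-10 m, 20 m height -> 0-30 m).
def allowed_distance_range(height_m):
    (h0, (lo0, hi0)), (h1, (lo1, hi1)) = (3.0, (3.0, 10.0)), (20.0, (0.0, 30.0))
    t = min(max((height_m - h0) / (h1 - h0), 0.0), 1.0)  # clamp to [0, 1]
    return (lo0 + t * (lo1 - lo0), hi0 + t * (hi1 - hi0))

print(allowed_distance_range(3.0))   # (3.0, 10.0)
print(allowed_distance_range(20.0))  # (0.0, 30.0)
```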

In another example, when the pitch angle is lower than a predetermined threshold value (e.g., about −40°), movable object 102 may not be able to obtain sufficient features of target 106 in the captured image, and thus the target information may not be accurate enough for target tracking. To avoid this, the system may maintain a threshold value for the pitch angle. The one or more spatial relationship factors (e.g., horizontal distance, pitch angle, height, etc.) may be evaluated independently or in combination to determine whether the spatial relationship is appropriate at step 714.

In yet another example, when the horizontal distance between target 106 and movable object 102 is too large, the system may not be able to obtain sufficient pixel information of target 106 for target tracking. Nor should the horizontal distance be too small, due to safety concerns. Thus the system may maintain a suitable horizontal distance range (e.g., a safe distance range or an allowed horizontal distance range).
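
Putting the individual factors together, the following sketch shows one way step 714 might combine the height window, the pitch-angle floor, and the allowed distance range into a single decision; the specific limits and the combination logic are illustrative assumptions.

```python
# Sketch of step 714: the spatial relationship is considered appropriate only
# if the height, pitch angle, and horizontal distance all pass their checks.
def spatial_relationship_ok(height_m, pitch_deg, distance_m,
                            min_height_m=2.0, max_height_m=50.0,
                            min_pitch_deg=-40.0,
                            distance_range_m=(3.0, 10.0)):
    if not (min_height_m <= height_m <= max_height_m):
        return False
    if pitch_deg < min_pitch_deg:
        return False
    lo, hi = distance_range_m
    return lo <= distance_m <= hi

print(spatial_relationship_ok(height_m=3.0, pitch_deg=-20.0, distance_m=6.0))  # True
print(spatial_relationship_ok(height_m=3.0, pitch_deg=-60.0, distance_m=6.0))  # False
```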

When one or more spatial relationship factors are determined to be appropriate for target tracking, the system allows (716) the user to confirm initiating automatic target tracking in PRT mode. In some embodiments, the system sends a notification to a display device, such as display 508 of control unit 104 and/or display device 616 of computing device 126. The user of the display device can respond to the notification by confirming whether or not to enter the automatic target tracking mode through the display device 616, which generates a response and returns the response to the system. For example, the user may generate the response by tapping a touch screen, clicking a mouse, or using any other suitable user interaction method to confirm initiating automatic target tracking.

Upon receipt of the response, the system determines (718) whether the user confirms starting automatic target tracking. When the system receives the user's confirmation to start automatic target tracking, the system terminates (720) the initiation process for automatic target tracking. The system proceeds to automatically track target 106 as the predetermined recognizable target type in PRT mode.

When the system does not receive user confirmation to start automatic target tracking, the system refines (722) control gain to keep tracking target 106 and capturing one or more subsequent images by imaging device 214. In some embodiments, the system (motion control module and/or tracking module located at movable object 102, control unit 104, and/or computing device 126) adjusts one or more control parameters of movable object 102 and/or carrier 108 based on the determined spatial relationship (e.g., the horizontal distance and/or the pitch angle) between target 106 and movable object 102. In some embodiments, the one or more control parameters may be adjusted based on one or more characteristics associated with the identified predetermined recognizable target type. For example, when target 106 is determined to be a human target, the speed of movable object 102 may be adjusted to be in sync with the moving speed of the human target.

In some embodiments, adjusting the one or more control parameters includes an adjustment of an orientation, position, attitude, and/or one or more movement characteristics of movable object 102, carrier 108, and/or payload 110. In some embodiments, instructions are generated to change a control parameter of imaging device 214 and/or one or more sensors of movable object sensing system 122, e.g., changing zoom, focus, or other characteristics associated with imaging device 214. In some embodiments, the instructions to refine control gain are generated using information from image data in combination with sensing data acquired by one or more sensors of movable object sensing system 122 (e.g., a proximity sensor and/or a GPS sensor) and/or position information transmitted by target 106 (e.g., a GPS location).

In some embodiments, the system refines control gain by adjusting a zoom level of imaging device 214 (assuming that the imaging device supports the zoom level required), by adjusting one or more movement characteristics of movable object 102, or by a combination of adjusting a zoom level of imaging device 214 and adjusting one or more movement characteristics of movable object 102. In some embodiments, a control application (e.g., control module 402, tracking module 404, and/or a control application of control unit 104 and/or computing device 126) determines one or more adjustments. For example, if the imaging device 214 does not support a zoom level required to substantially track target 106, one or more movement characteristics of movable object 102 are adjusted instead of or in addition to adjusting the zoom level of imaging device 214.

In some embodiments, the adjustment of the orientation, position, attitude, one or more movement characteristics, and/or another operation parameter of movable object 102, carrier 108, and/or payload 110 is limited by one or more constraints imposed by system configuration 400 (e.g., as configured by a manufacturer, administrator, or user), by control unit 104 (e.g., user control input received at control unit 104), and/or by computing device 126. Examples of constraints include limits (e.g., maximum and/or minimum limits) for a rotation angle, angular velocity, and/or linear velocity along one or more axes. For example, the angular velocity of movable object 102, carrier 108, and/or payload 110 around an axis is constrained by, e.g., a maximum angular velocity that is allowed for movable object 102, carrier 108, and/or payload 110. In some embodiments, the linear velocity of movable object 102, carrier 108, and/or payload 110 is constrained by, e.g., a maximum linear velocity that is allowed for movable object 102, carrier 108, and/or payload 110. In some embodiments, adjustment of the focal length of imaging device 214 is constrained by a maximum and/or minimum focal length for imaging device 214.

When one or more spatial relationship factors (e.g., horizontal distance, pitch angle, height, etc.) are determined to be inappropriate for target tracking, the system does not start automatic target tracking. In some embodiments, the system sends (724) a warning indicator. The warning indicator includes text, audio (e.g., a siren or beeping sound), images or other visual indicators (e.g., a changed user interface background color and/or a flashing light), and/or haptic feedback. A warning indicator is provided at, e.g., movable object 102, control unit 104, and/or computing device 126. For example, the warning indicator includes a text box showing “Target too close” or “Target too remote” to the user.

After the system refines (722) control gain or after the system sends (724) the warning indicator, the system obtains (760) a next image containing the target using imaging device 214. In some embodiments, the next image is an image captured after a certain period of time (e.g., 0.01 second, 0.1 second, 0.2 second, 0.5 second, or 1 second). In some embodiments, the next image is a subsequent image frame of a video, such as an immediately subsequent image frame, or an image frame after a certain number of frames. In some embodiments, the next image contains the same target as target 106 in the previous image. In some embodiments, the next image contains a target different from the target contained in the previous image.

The system then determines (762) whether target 106 in the current image is a predetermined recognizable target type. In some embodiments, target 106 is determined to be the same recognizable target type as previously identified. In some embodiments, due to different target information captured in the current image, target 106 may be determined to be a different predetermined recognizable target type. Because the position of movable object 102 and/or the lens configuration of imaging device 214 may have changed since the last image, the system proceeds to determine (708) the dimension of target 106 and to calculate (712) the spatial relationship between target 106 and movable object 102. In some embodiments, when step 708 is not included in method 700, the system proceeds to enter PRT mode 710 to calculate (712) the spatial relationship between target 106 and movable object 102.

In some embodiments, when the features of target 106 in the current image do not match any characteristics in the predetermined recognizable target type information 414, the system associates target 106 with a generic target type. The system exits the PRT mode 710 and enters the GT mode 740 as discussed below.

In some embodiments, at step 706 or 762, when target 106 is not determined to be any predetermined recognizable target type (e.g., determined to not match any predetermined recognizable target type), the system enters a mode (740) for initiating target tracking associated with a generic target (GT) type (e.g., GT mode 740). In some embodiments where method 700 includes step 708, at step 708, when the system determines that the dimension of target 106 in the captured image is not appropriate for initiating automatic target tracking for the predetermined recognizable target, the system also enters GT mode 740.

In GT mode 740, the system determines (742) a spatial relationship between target 106 and movable object 102. In some embodiments, the spatial relationship includes one or more factors such as a horizontal distance between target 106 and movable object 102, a height of movable object 102, and a pitch angle. In some embodiments, the horizontal distance between target 106 and movable object 102 is determined using a triangulation method. For example, the horizontal distance may be calculated using a height of movable object 102 and a pitch angle between movable object 102 and target 106. The pitch angle may be determined using pitch movements of movable object 102, carrier 108, and/or payload 110. In some embodiments, the system determines a safe distance range based on the height of movable object 102. For example, when the height of movable object 102 is about 3 meters, the allowed horizontal distance range is from about 3 meters to about 10 meters. When the height of movable object 102 is about 20 meters, the allowed horizontal distance range is from about 0 meters to about 30 meters.

The system determines (744) whether the horizontal distance is within the determined safe distance range. When one or more spatial relationship factors are determined to be appropriate for target tracking, the system allows (746) the user to confirm initiating automatic target tracking in the GT mode. In some embodiments, the system sends a notification to a display device, such as display 508 of control unit 104 and/or display device 616 of computing device 126. The user then responds in a similar manner as described above for the PRT mode. For example, the user may tap a touch screen, click a mouse, or use any other suitable user interaction method to confirm initiating automatic target tracking.

Upon receipt of the response, the system determines (748) whether the user confirms starting automatic target tracking. When the system receives the user's confirmation to start automatic target tracking, the system terminates (750) the initiation process for target tracking. The system proceeds to automatically track target 106 as a generic target type in GT mode.

When the system does not receive user confirmation to start automatic target tracking, the system refines (752) control gain to keep tracking target 106 and capturing one or more subsequent images by imaging device 214. In some embodiments, the system (motion control module and/or tracking module located at movable object 102, control unit 104, and/or computing device 126) adjusts one or more control parameters of movable object 102 and/or carrier 108 based on the determined spatial relationship (e.g., the horizontal distance and/or the pitch angle) between target 106 and movable object 102.

In some embodiments, adjusting the one or more control parameters includes an adjustment of an orientation, position, attitude, and/or one or more movement characteristics of movable object 102, carrier 108, and/or payload 110. In some embodiments, instructions are generated to change a control parameter of imaging device 214 and/or one or more sensors of movable object sensing system 122, e.g., changing zoom, focus, or other characteristics associated with imaging device 214. In some embodiments, the instructions to refine control gain are generated using information from image data in combination with sensing data acquired by one or more sensors of movable object sensing system 122 (e.g., a proximity sensor and/or a GPS sensor) and/or position information transmitted by target 106 (e.g., a GPS location).

In some embodiments, the system refines control gain by adjusting a zoom level of imaging device 214 (e.g., if the imaging device supports the zoom level required), by adjustment of one or more movement characteristics of movable object 102, or by a combination of adjusting a zoom level of imaging device 214 and adjustment of one or more movement characteristics of movable object 102. In some embodiments, a control application (e.g., control module 402, tracking module 404, and/or a control application of control unit 104 and/or computing device 126) determines one or more adjustments. For example, if the imaging device 214 does not support a zoom level required to substantially track target 106, one or more movement characteristics of movable object 102 are adjusted instead of or in addition to adjusting the zoom level of imaging device 214.

As discussed elsewhere herein, in some embodiments, the adjustment of the orientation, position, attitude, one or more movement characteristics, and/or another operation parameter of movable object 102, carrier 108, and/or payload 110 is limited by one or more constraints imposed by system configuration 400 (e.g., as configured by a manufacturer, administrator, or user), by control unit 104 (e.g., user control input received at control unit 104), and/or by computing device 126.

When one or more spatial relationship factors (e.g., horizontal distance, pitch angle, height, etc.) are determined to be insufficient for target tracking, the system does not allow the user to start automatic target tracking. The system also sends (754) a warning indicator. In some embodiments, a warning indicator includes text, audio (e.g., a siren or beeping sound), images or other visual indicators (e.g., a changed user interface background color and/or a flashing light), and/or haptic feedback. A warning indicator is provided at, e.g., movable object 102, control unit 104, and/or computing device 126. In one example, the warning indicator includes a text box showing “Target too far” or “Target too close” to the user.

After the system refines (752) control gain or after the system sends (754) the warning indicator, the system obtains (760) a next image containing a target using imaging device 214 as discussed elsewhere herein. The system then determines (762) whether the target (e.g., target 106) in the current image is a predetermined recognizable target type or not.

When target 106 in the current image is determined to be a predetermined recognizable target type, the system proceeds to determine (708) the dimension of target 106 and to calculate (712) the spatial relationship between target 106 and movable object 102. When target 106 in the current image does not belong to any predetermined recognizable target type, the system associates target 106 with a generic target type and proceeds with GT mode 740 as discussed above.

FIG. 8 illustrates an exemplary configuration 800 of a movable object 102, carrier 108, and payload 110, in accordance with embodiments. The configuration 800 is used to illustrate exemplary adjustments to an orientation, position, attitude, and/or one or more movement characteristics of movable object 102, carrier 108, and/or payload 110, e.g., as used to perform initialization of target tracking and/or to track target 106.

In some embodiments, movable object 102 rotates around up to three orthogonal axes, such as X₁ (pitch) 810, Y₁ (yaw) 808, and Z₁ (roll) 812 axes. Rotations around the three axes are referred to herein as pitch rotation 822, yaw rotation 820, and roll rotation 824, respectively. Angular velocities of movable object 102 around the X₁, Y₁, and Z₁ axes are referred to herein as ω_(X1), ω_(Y1), and ω_(Z1), respectively. In some embodiments, movable object 102 engages in translational movements 828, 826, and 830 along the X₁, Y₁, and Z₁ axes, respectively. Linear velocities of movable object 102 along the X₁, Y₁, and Z₁ axes are referred to herein as V_(X1), V_(Y1), and V_(Z1), respectively.

In some embodiments, payload 110 is coupled to movable object 102 via carrier 108. In some embodiments, payload 110 moves relative to movable object 102 (e.g., payload 110 is caused by actuator 204 of carrier 108 to move relative to movable object 102).

In some embodiments, payload 110 moves around and/or along up to three orthogonal axes, X₂ (pitch) 816, Y₂ (yaw) 814, and Z₂ (roll) 818. The X₂, Y₂, and Z₂ axes are respectively parallel to the X₁, Y₁, and Z₁ axes. In some embodiments, where payload 110 includes imaging device 214 (e.g., including an optical module 802), the roll axis Z₂ 818 is substantially parallel to an optical path or optical axis for optical module 802. In some embodiments, optical module 802 is optically coupled to image sensor 216 (and/or one or more sensors of movable object sensing system 122). In some embodiments, carrier 108 causes payload 110 to rotate around up to three orthogonal axes, X₂ (pitch) 816, Y₂ (yaw) 814, and Z₂ (roll) 818, e.g., based on control instructions provided to actuator 204 of carrier 108. The rotations around the three axes are referred to herein as the pitch rotation 834, yaw rotation 832, and roll rotation 836, respectively. The angular velocities of payload 110 around the X₂, Y₂, and Z₂ axes are referred to herein as ω_(X2), ω_(Y2), and ω_(Z2), respectively. In some embodiments, carrier 108 causes payload 110 to engage in translational movements 840, 838, and 842 along the X₂, Y₂, and Z₂ axes, respectively, relative to movable object 102. The linear velocities of payload 110 along the X₂, Y₂, and Z₂ axes are referred to herein as V_(X2), V_(Y2), and V_(Z2), respectively.

In some embodiments, the movement of payload 110 may be restricted (e.g., carrier 108 restricts movement of payload 110, e.g., by constricting movement of actuator 204 and/or by lacking an actuator capable of causing a particular movement).

In some embodiments, the movement of payload 110 may be restricted to movement around and/or along a subset of the three axes X₂, Y₂, and Z₂ relative to movable object 102. For example, payload 110 is rotatable around X₂, Y₂, Z₂ (movements 832, 834, 836), or any combination thereof, while payload 110 is not movable along any of the axes (e.g., carrier 108 does not permit payload 110 to engage in movements 838, 840, 842). In some embodiments, payload 110 is restricted to rotation around one of the X₂, Y₂, and Z₂ axes. For example, payload 110 is only rotatable about the Y₂ axis (e.g., rotation 832). In some embodiments, payload 110 is restricted to rotation around only two of the X₂, Y₂, and Z₂ axes. In some embodiments, payload 110 is rotatable around all three of the X₂, Y₂, and Z₂ axes.

In some embodiments, payload 110 is restricted to movement along the X₂, Y₂, or Z₂ axis (movements 838, 840, 842), or any combination thereof, and payload 110 is not rotatable around any of the axes (e.g., carrier 108 does not permit payload 110 to engage in movements 832, 834, 836). In some embodiments, payload 110 is restricted to movement along only one of the X₂, Y₂, and Z₂ axes. For example, movement of payload 110 is restricted to movement 840 along the X₂ axis. In some embodiments, payload 110 is restricted to movement along only two of the X₂, Y₂, and Z₂ axes. In some embodiments, payload 110 is movable along all three of the X₂, Y₂, and Z₂ axes.

In some embodiments, payload 110 is able to perform both rotational and translational movement relative to movable object 102. For example, payload 110 is able to move along and/or rotate around one, two, or three of the X₂, Y₂, and Z₂ axes.

In some embodiments, payload 110 is coupled to movable object 102 directly without a carrier 108, or carrier 108 does not permit payload 110 to move relative to movable object 102. In some embodiments, the attitude, position, and/or orientation of payload 110 is fixed relative to movable object 102 in such cases.

In some embodiments, adjustment of the attitude, orientation, and/or position of payload 110 is performed by adjustment of movable object 102, carrier 108, and/or payload 110, such as an adjustment of a combination of two or more of movable object 102, carrier 108, and/or payload 110. For example, a rotation of 60 degrees around a given axis (e.g., the yaw axis) for the payload is achieved by a 60-degree rotation by movable object 102 alone, a 60-degree rotation by the payload relative to movable object 102 as effectuated by the carrier, or a combination of a 40-degree rotation by movable object 102 and a 20-degree rotation by the payload relative to movable object 102.
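
As an illustration of splitting a desired rotation between the vehicle and the carrier, the following sketch assumes the gimbal absorbs as much of the rotation as a hypothetical travel limit allows and the vehicle supplies the remainder, reproducing the 60-degree example above; the limit value and allocation rule are assumptions.

```python
# Sketch: allocate a desired payload rotation between the vehicle body and the
# carrier (gimbal), given an assumed gimbal travel limit.
def split_rotation(desired_deg, gimbal_limit_deg=20.0):
    gimbal_deg = max(-gimbal_limit_deg, min(gimbal_limit_deg, desired_deg))
    vehicle_deg = desired_deg - gimbal_deg
    return vehicle_deg, gimbal_deg

print(split_rotation(60.0))  # (40.0, 20.0), i.e., 40 degrees by the vehicle, 20 by the gimbal
```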

In some embodiments, a translational movement for the payload is achieved via adjustment of movable object 102, carrier 108, and/or payload 110, such as an adjustment of a combination of two or more of movable object 102, carrier 108, and/or payload 110. In some embodiments, a desired adjustment is achieved by adjustment of an operational parameter of the payload, such as an adjustment of a zoom level or a focal length of imaging device 214.

FIG. 9A illustrates an exemplary initialization process for tracking target 106, in accordance with some embodiments. FIGS. 9B-9C illustrate an image 952 containing target 106 displayed on a user interface 950, in accordance with embodiments. In some embodiments, imaging device 214 borne by payload 110 of movable object 102 captures image 952 as displayed in FIGS. 9B-9C. In some embodiments, user interface 950 may be used for selecting and/or initializing tracking of target 106. In some embodiments, user interface 950 is displayed by a control unit 104 and/or a computing device 126. In some embodiments, the user interface is displayed by display 508 of control unit 104. Image 952 on user interface 950 may include one or more objects (not shown) captured by imaging device 214 in addition to target 106.

In some embodiments, control unit 104 and/or computing device 126 include one or more input devices 506 for receiving user input. In some embodiments, input received by input device 506 is used to provide input indicating a user's interest in target 106, with which graphical selection indicator 954 is to be associated. In this way, a user indicates a target 106 to be tracked, in accordance with some embodiments. In some embodiments, user input received at input device 506 to associate a graphical selection indicator 954 with target 106 includes an input gesture received at a point that corresponds to target 106. In some embodiments, an input gesture is provided by a contact (e.g., by a finger and/or stylus) at display 508 (e.g., a touchscreen display). In some embodiments, a selection of target 106 is provided by a user-manipulated input device 506 (such as a mouse, button, joystick, keyboard, etc.).

As shown in FIG. 9C, a graphical tracking indicator 955 is shown to be associated with target 106. In some embodiments, graphical tracking indicator 955 may be identical to graphical selection indicator 954. In some embodiments, graphical tracking indicator 955 may be generated by the system based on graphical selection indicator 954. For example, graphical tracking indicator 955 may be a bounding box generated by the system as a regular-shaped box enclosing target 106. In some embodiments, target information 408 is generated based on received input (e.g., associated with graphical selection indicator 954) and/or graphical tracking indicator 955 associated with target 106. In some embodiments, the position of graphical tracking indicator 955 changes as the position of target 106 changes, e.g., such that graphical tracking indicator 955 remains associated with (e.g., adjacent to or surrounding) tracked target 106.

In some embodiments, as discussed in method 700, the system compares extracted features of target information 408 with one or more characteristics associated with one or more predetermined recognizable target types. For example, as shown in FIG. 9C, target 106 may be identified to be a human target. In some embodiments, the system displays an indication box 960 to notify the user that target 106 is identified as a predetermined recognizable target type.

FIG. 10A illustrates an exemplary initialization process for tracking target 106, in accordance with some embodiments. FIG. 10B illustrates an image 1052 containing target 106 displayed on a user interface 1050, in accordance with embodiments. As discussed with reference to method 700 of FIG. 7, in some embodiments, the system determines a dimension of target 106 using a number of pixels included in image 1052. For example, the system determines a height of a bounding box 1055 (box_h) using a number of pixels along a height dimension. In some embodiments, when the height box_h is determined to be smaller than a predetermined threshold value, the system displays a warning indicator 1060 to notify the user that target 106 is too small in image 1052. The system may switch to GT mode 740 as discussed with reference to FIG. 7.

FIG. 11 illustrates an exemplary method for determining a pitch angle between target 106 and movable object 102, in accordance with some embodiments. In some embodiments, the pitch angle is determined based on a pitch angle (α) of a carrier borne by movable object 102 and a target pitch angle (β) on the captured image. For example, as shown in FIG. 11, the pitch angle α indicates a pitch angle of the current image center (e.g., the camera center, or the optical axis of imaging device 214) relative to the horizontal level. In some embodiments, the pitch angle α is determined based on a pitch angle of a carrier (e.g., a gimbal). In some embodiments, the pitch angle α is determined based on a combination of a pitch angle of payload 110, a pitch angle of a gimbal, and/or a pitch angle of movable object 102.

In some embodiments, the target pitch angle β is determined based on a number of pixels related to a height (h) which extends from the center of the image to the bottom of target 106. For example, in FIG. 11, the height h extends from the center of the image to the ground (e.g., assuming that the human target is standing on the ground).

FIG. 12 illustrates an exemplary method for determining a pitch angle of a target 106, in accordance with embodiments. Assume that an image 1200 has a width of W pixels and a height of H pixels (where W and H are positive integers). A position within the image is defined by a pair of coordinates along the width of the image and along the height of the image, where the upper left corner of the image has coordinates (0, 0) and the lower right corner of the image has coordinates (W, H). A center pixel P has a pair of coordinates (u₀, v₀), where u₀=W/2 and v₀=H/2. A pixel B near the feet of the human target has a pair of coordinates (u₁, v₁). The height (h) between the center of the image and the bottom of target 106 as discussed in FIG. 11 can be calculated as |v₁−v₀|. Assuming that image 1200 covers a degree range (γ₁) along the width of the image and a degree range (γ₂) along the height of the image, the degrees per pixel (θ) in image 1200 can be determined by θ=γ₁/W=γ₂/H. For example, when image 1200 covers 81.9281° along the width of the image and 46.0846° along the height of the image, and has a resolution of 640×360, the degrees per pixel (θ) is calculated to be approximately 0.1280127°.
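
The degrees-per-pixel computation and the conversion from a pixel offset to the target pitch angle β can be sketched as follows, using the example field-of-view and resolution values above; the pixel coordinates of the target bottom are hypothetical.

```python
# Sketch of FIG. 12: derive degrees-per-pixel from the field of view and image
# resolution, then convert the pixel offset between the image center and the
# bottom of the target into the target pitch angle beta (FIG. 11).
W, H = 640, 360                 # image resolution in pixels
FOV_W_DEG, FOV_H_DEG = 81.9281, 46.0846

theta = FOV_W_DEG / W           # degrees per pixel (equals FOV_H_DEG / H)
u0, v0 = W / 2, H / 2           # image-center pixel P
u1, v1 = 320, 300               # hypothetical pixel B at the bottom of the target

h_px = abs(v1 - v0)             # pixel height between image center and target bottom
beta = h_px * theta             # target pitch angle in degrees
print(round(theta, 7), round(beta, 3))  # 0.1280127 15.362
```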

Referring back to FIG. 11, the target pitch angle β can be calculated as β=h×θ. Thus the pitch angle between target 106 and movable object 102 can be determined as the sum of the gimbal pitch angle α and the target pitch angle β. As discussed in method 700 of FIG. 7, in some embodiments, the system compares the calculated pitch angle between target 106 and movable object 102 with a predetermined threshold value (e.g., −40°). When the calculated pitch angle between target 106 and movable object 102 is smaller than the predetermined threshold value, for example, when it is calculated to be −60°, the system may send a warning indicator (e.g., a visual or an audio indicator) to notify the user that the pitch angle is not suitable for automatic target tracking.
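
A minimal sketch of this pitch-angle check, assuming the −40° threshold from the text and illustrative values for α and β:

```python
# Sketch: the pitch angle between the target and the movable object is the
# gimbal pitch angle alpha plus the target pitch angle beta; tracking is
# flagged when the sum falls below the threshold.
PITCH_THRESHOLD_DEG = -40.0

def pitch_angle_ok(alpha_deg, beta_deg):
    total = alpha_deg + beta_deg
    return total >= PITCH_THRESHOLD_DEG, total

print(pitch_angle_ok(-30.0, -5.0))   # (True, -35.0)
print(pitch_angle_ok(-55.0, -5.0))   # (False, -60.0) -> warn the user
```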

FIG. 13A illustrates an initialization process for tracking target 106, in accordance with some embodiments. FIG. 13B illustrates an image 1352 containing target 106 displayed on a user interface 1350, in accordance with embodiments. In some embodiments, when the pitch angle between target 106 and movable object 102 is smaller than the predetermined threshold value, movable object 102 hovers above target 106 as shown in FIG. 13A. Imaging device 214 may not be able to capture sufficient information about target 106 from this pitch angle. For example, target 106 shown in image 1352 of FIG. 13B may demonstrate different features (e.g., shape, pattern, or size) from the characteristics of a predetermined recognizable target (e.g., a human target). The system therefore displays a warning indicator 1360 to notify the user, for example, by displaying the text “Warning: Pitch angle too low” on user interface 1350.

FIG. 14 illustrates an exemplary method for determining a horizontal distance between a predetermined recognizable target and movable object 102, in accordance with embodiments. In some embodiments, movable object 102 is at a height similar to the height of target 106. In some embodiments, when target 106 is identified to be a predetermined recognizable target (e.g., a human target), a horizontal distance (d) between target 106 and movable object 102 is determined using one or more characteristics associated with the predetermined recognizable target, such as a height of the human target (target_h). For instance, after identifying target 106 to be a human target, a height of 1.7 meters based on preset characteristics in predetermined recognizable target type information 414 is assigned to target 106. The distance d between target 106 and movable object 102 can be expressed as:

$d = \frac{target\_h}{2 \tan\left(box\_h \cdot \theta\right)}$

where target_h is a characteristic of a predetermined recognizable target preset by predetermined recognizable target type information 414 (e.g., an average height of a human), box_h is a height of a bounding box enclosing target 106 (which approximates the height of target 106 displayed in the current image), and θ is the degrees per pixel in the current image as discussed in FIG. 12. In some embodiments, as discussed in method 700 of FIG. 7, a safe distance range is determined based on a height of movable object 102. The system compares the calculated distance d with the safe distance range, and sends a warning indicator to the user when the distance d is not within the safe distance range.
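
The distance estimate above can be sketched directly from the formula as written, assuming the 1.7-meter human height, a hypothetical bounding-box height, and the degrees-per-pixel value from FIG. 12.

```python
import math

# Sketch of the PRT-mode distance estimate: target_h is the assumed real-world
# height of the recognized type, box_h is the bounding-box height in pixels,
# and theta is the degrees-per-pixel value from FIG. 12.
def prt_horizontal_distance(target_h_m, box_h_px, theta_deg_per_px):
    angle_rad = math.radians(box_h_px * theta_deg_per_px)
    return target_h_m / (2.0 * math.tan(angle_rad))

d = prt_horizontal_distance(target_h_m=1.7, box_h_px=60, theta_deg_per_px=0.1280127)
print(round(d, 2))  # ~6.3 meters for these illustrative values
```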

FIG. 15 illustrates an exemplary method for determining a horizontal distance between a generic target and movable object 102, in accordance with embodiments. In some embodiments, when target 106 is identified to be a generic target, the system determines a horizontal distance (d) between target 106 and movable object 102 using a triangulation method. For instance, the distance d between generic target 106 and movable object 102 can be expressed as:

$d = \frac{H}{\tan (\alpha)}$

where H is a height of movable object 102, and α is a pitch angle of a carrier of imaging device 214, such as a gimbal.
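
A minimal sketch of this triangulation, with illustrative height and pitch-angle values:

```python
import math

# Sketch of the GT-mode triangulation: horizontal distance equals the vehicle
# height H divided by the tangent of the (downward) carrier pitch angle alpha.
def gt_horizontal_distance(height_m, pitch_deg):
    return height_m / math.tan(math.radians(pitch_deg))

print(round(gt_horizontal_distance(height_m=3.0, pitch_deg=30.0), 2))  # 5.2 meters
```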

FIGS. 16A-16F are a flow diagram illustrating a method 1600 for tracking a movable object, in accordance with some embodiments. The method 1600 is performed at a system, such as one or more devices including movable object 102, control unit 104, and/or computing device 126. For example, instructions for performing the method 1600 are stored in motion control module 402 of memory 118 and executed by processor(s) 116. In some embodiments, the computing functionalities discussed herein are performed at movable object 102, at the ground controller (such as control unit 104 and/or computing device 126), or at a combination of certain computing functionalities contained in both movable object 102 and the ground controller. In some embodiments, one or more steps of method 1600 are performed at the ground controller, and one or more other steps of method 1600 are performed at movable object 102.

The system obtains (1602) a first image frame captured by imaging device 214 borne by movable object 102. The first image frame contains target 106.

The system extracts (1604) one or more features of target 106 from the first image frame. Target 106 is within a region selected by a user on the first image frame. The one or more features of the target object comprise (1652) one or more dimensional features displayed on the first image frame. The one or more dimensional features include a shape, a pattern, a length, a width, and/or a height of the target object. The system generates (1654) a bounding box based on the one or more dimensional features of target 106 for defining target 106 on the first image frame. The system obtains (1656) a second image frame including target 106 captured by imaging device 214. The one or more features of target 106 further comprise a speed and an acceleration of target 106 calculated based on the one or more dimensional features of target 106. In some embodiments, the second image frame is a subsequent image frame of the first image frame. In some embodiments, the second image frame is an image captured a certain period of time after the first image frame.
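
As an illustration of deriving a speed feature from two frames, the following sketch converts the shift of the bounding-box center between frames to meters using the scale implied by an assumed real-world target height, then divides by the frame interval; the function name, the values, and the constant-height assumption are all illustrative.

```python
# Sketch: estimate target speed from the bounding-box displacement between two
# frames, using the meters-per-pixel scale implied by the box height and an
# assumed real-world target height.
def estimate_speed(center_px_1, center_px_2, box_h_px, dt_s, target_h_m=1.7):
    m_per_px = target_h_m / box_h_px
    dx = (center_px_2[0] - center_px_1[0]) * m_per_px
    dy = (center_px_2[1] - center_px_1[1]) * m_per_px
    return (dx ** 2 + dy ** 2) ** 0.5 / dt_s

# Box center moved 20 pixels between frames captured 0.2 s apart.
print(round(estimate_speed((320, 180), (340, 180), box_h_px=170, dt_s=0.2), 2))  # 1.0 m/s
```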

The system determines (1606) whether target 106 is a predetermined recognizable object type based on a comparison of the extracted one or more features with one or more characteristics associated with the predetermined recognizable object type. In some embodiments, the system performs image recognition and/or an object classification to identify target 106.

In accordance with a determination that target 106 is a predetermined recognizable object type (e.g., a human target), the system initiates (1608) tracking functions provided in the system and associated with the predetermined recognizable object type. In accordance with a determination that the target 106 does not belong to any predetermined recognizable object type, the system initiates (1610) tracking functions provided in the computing system and associated with a general object type.

When target 106 is determined to be a predetermined recognizable object type, the tracking functions associated with the predetermined recognizable object type include (1612) adjusting one or more control parameters that control one or more selected from a spatial relationship between movable object 102 and target 106, a movement of movable object 102, and a movement of carrier 108 (e.g., a gimbal) borne by movable object 102. In some embodiments, the system adjusts the control parameters and the spatial relationship between movable object 102 and target 106. In some embodiments, the system enables certain functional modules that allow certain advanced controls. For instance, the system enables controlling a flight pattern of movable object 102 and/or adjusting a position of carrier 108 using human gestures.

In some embodiments, when target 106 is identified to be a human, the system recognizes (1638) one or more body gestures of the human target. For example, the system can recognize a hand wave, a finger gesture, and any other body gestures of the human target. In some embodiments, the system adjusts (1640) one or more control parameters of movable object 102 in accordance with the one or more body gestures of the human target. In some embodiments, when target 106 is identified to be a human, the system performs (1642) a facial recognition of the human target to retrieve one or more facial features of the human target. In some embodiments, when movable object 102 loses the human target in subsequently captured images, the facial features of the human target may be used to find the previous human target and to avoid identifying a wrong human target. In some embodiments, when target 106 is identified to be a human, the system performs (1644) machine learning to obtain one or more personal features of the human target. The obtained personal features can be used for automatically tracking the human target by movable object 102.

In some embodiments, the one or more control parameters are generated (1614) in accordance with one or more characteristics associated with the predetermined recognizable object type. For example, a characteristic (e.g., speed, height, etc.) is used for generating the one or more control parameters. In some embodiments, the one or more control parameters comprise (1616) a yaw angle movement of movable object 102, a translational movement of movable object 102, a velocity of movable object 102, and an acceleration of movable object 102. In some embodiments, the translational movement of movable object 102 comprises a horizontal movement of movable object 102 and/or a vertical movement of movable object 102.
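A minimal sketch of how control parameters might be generated from the characteristics of the recognized type, assuming the type's speed and acceleration ranges cap the commanded values; the parameter names and default limits are assumptions for the sketch.

```python
# Sketch: cap commanded motion by the recognized type's characteristic limits.
def control_parameters(desired: dict, type_characteristics: dict) -> dict:
    max_speed = type_characteristics.get("max_speed_mps", 10.0)   # assumed default
    max_accel = type_characteristics.get("max_accel_mps2", 3.0)   # assumed default
    return {
        "yaw_rate_dps":   desired.get("yaw_rate_dps", 0.0),
        "horizontal_mps": min(desired.get("horizontal_mps", 0.0), max_speed),
        "vertical_mps":   min(desired.get("vertical_mps", 0.0), max_speed),
        "accel_mps2":     min(desired.get("accel_mps2", 0.0), max_accel),
    }
```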

In some embodiments, imaging device 214 is coupled (1618) to carrier 108 (e.g., a gimbal) borne by movable object 102. The one or more control parameters further comprise (1618) a yaw angle movement of the gimbal and/or a pitch angle movement of the gimbal.

In some embodiments, the system determines (1620) the spatial relationship between movable object 102 and target 106 using at least the one or more characteristics associated with the predetermined recognizable object type.

In some embodiments, prior to determining the spatial relationship between movable object 102 and target 106, the system determines (1622) whether a dimension of target 106 displayed in the first image frame is above a predetermined threshold value. In some embodiments, the dimension of target 106 is determined by a number of pixels displayed on the first image frame. In some embodiments, a minimum number of pixels is preset as the threshold value that is sufficient for movable object 102 to track target 106. In some embodiments, in accordance with a determination that the dimension of target 106 is above or equal to the predetermined threshold value, the system determines (1622) the spatial relationship between movable object 102 and target 106. In some embodiments, in accordance with a determination that the dimension of target 106 is below the predetermined threshold value, the system initiates (1622) tracking functions associated with the general object type. In some embodiments, when the system determines that the dimension of target 106 is below the predetermined threshold value, the system switches from PRT mode to GT mode as discussed in method 700 of FIG. 7.
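A minimal sketch of the pixel-count gate described above; the 400-pixel threshold is an illustrative assumption, not a disclosed value.

```python
# Sketch: choose the tracking mode from the target's on-screen size in pixels.
MIN_TARGET_PIXELS = 400  # assumed threshold

def choose_tracking_mode(target_pixel_count: int) -> str:
    """Return "PRT" (predetermined recognizable target) when the target is large
    enough to estimate the spatial relationship, otherwise fall back to "GT"."""
    return "PRT" if target_pixel_count >= MIN_TARGET_PIXELS else "GT"
```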

In some embodiments, the spatial relationship between movable object 102 and target 106 comprises (1624) a horizontal distance between movable object 102 and target 106. In some embodiments, the spatial relationship comprises (1626) a pitch angle between movable object 102 and target 106. The system further determines (1626) whether the pitch angle is lower than a predetermined value (e.g., −40 degrees). In some embodiments, the pitch angle is determined (1628) using a pitch angle (e.g., pitch angle α of FIG. 11) of a gimbal borne by movable object 102 for carrying the imaging device and a target pitch angle (e.g., pitch angle θ of FIG. 11) of the target 106 displayed on the first image frame.
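For illustration, a sketch of the pitch-angle check; combining the gimbal pitch angle α and the target pitch angle θ by simple addition is an assumption made for the sketch, as is reusing the −40 degree example as the threshold.

```python
# Sketch: flag a pitch angle that is too steep for reliable tracking.
PITCH_LIMIT_DEG = -40.0  # example threshold from the text

def pitch_too_low(gimbal_pitch_deg: float, target_pitch_deg: float) -> bool:
    """True when the combined pitch angle falls below the predetermined value."""
    combined_pitch = gimbal_pitch_deg + target_pitch_deg  # assumed additive combination
    return combined_pitch < PITCH_LIMIT_DEG
```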

In accordance with a determination that the pitch angle is lower than the predetermined value, the system sends (1630) a warning indication to the user (e.g., as shown in FIG. 13B). In some embodiments, the system adjusts (1632), or allows (1632) the user to adjust, the one or more control parameters of movable object 102 such that an updated pitch angle is equal to or above the predetermined value. The system obtains (1632) one or more image frames subsequent to the first image frame for determining the updated pitch angle.

In accordance with a determination that the pitch angle is higher than or equal to the predetermined value, the system sends (1634) a request to the user to confirm whether to initiate an automatic tracking mode. For instance, the system sends a request to the user to start automatically tracking the identified predetermined recognizable target using the one or more associated characteristics.

In accordance with a determination that the user does not confirm to initiate the automatic tracking mode, the system obtains (1636) a second image frame including target 106. The system determines (1636) whether target 106 is a predetermined recognizable object type based on a comparison of one or more features extracted from the second image frame with one or more characteristics associated with the predetermined recognizable object type. In accordance with a determination that target 106 does not belong to any predetermined recognizable object type, the system initiates (1636) the tracking functions associated with the general object type. In some embodiments, before receiving user confirmation on the first image frame, the system switches from PRT mode to GT mode when target features on the next frame do not match any predetermined recognizable target type.
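A minimal sketch of re-checking the target type on each subsequent frame until the user confirms automatic tracking, switching between PRT mode and GT mode accordingly; the callables for feature extraction, classification, and user confirmation are placeholders rather than disclosed interfaces.

```python
# Sketch: keep re-classifying the target on new frames until the user confirms.
def await_confirmation(frames, extract_features, classify_target, user_confirmed):
    """Return the tracking mode ("PRT" or "GT") in effect when confirmation arrives."""
    mode = "PRT"
    for frame in frames:
        if user_confirmed():
            return mode                      # start automatic tracking in the current mode
        features = extract_features(frame)
        matched = classify_target(features)  # None means no predetermined type matched
        mode = "PRT" if matched is not None else "GT"
    return mode
```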

In some embodiments, the system determines (1646) whether a height of movable object 102 relative to the ground is within a predetermined height range (e.g., a range from about 1 meter to about 50 meters). When movable object 102 is too high, target 106 may become too small to extract sufficient features for target tracking. Movable object 102 also cannot be too low, for safety reasons. In accordance with a determination that the height of movable object 102 is not within the predetermined height range, the system sends a warning signal to the user. The user may manually adjust the height of movable object 102 to be within the predetermined height range. The determination of height may occur at any step of method 700.
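A minimal sketch of the height gate, reusing the 1–50 meter example range above; the warning mechanism is reduced to a returned message for illustration.

```python
# Sketch: check that the vehicle height is inside the predetermined range.
HEIGHT_RANGE_M = (1.0, 50.0)  # example range from the text

def height_warning(height_m: float) -> str | None:
    """Return a warning message when the height is outside the range, else None."""
    lo, hi = HEIGHT_RANGE_M
    if height_m < lo:
        return "too low: raise the vehicle before tracking"
    if height_m > hi:
        return "too high: target may be too small to extract features"
    return None
```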

In some embodiments, the system obtains (1648) the one or more characteristics associated with one or more predetermined recognizable object types respectively. The one or more characteristics associated with the one or more predetermined recognizable object types may be obtained from user input, from image recognition, and/or from any information available on a computer network.

In some embodiments, the one or more characteristics associated with the predetermined recognizable object type comprise (1650) a category, a shape, a pattern, a length range, a width range, a height range, a speed range, and/or an acceleration range of one or more objects of the predetermined recognizable object type.

In some embodiments, the system provides (1658) one or more candidate recognizable object types for user selection. The one or more candidate recognizable object types are identified based on the comparison between the extracted one or more features and one or more characteristics associated with one or more predetermined recognizable object types respectively. In some embodiments, the system determines that the features of target 106 match the characteristics of more than one predetermined recognizable object type. The system displays the identified predetermined recognizable object types to the user for user selection. The system receives (1658) a user input indicating the target object is the predetermined recognizable object type selected from the one or more candidate recognizable object types.
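A minimal sketch of offering candidate types for user selection when the extracted features match more than one predetermined recognizable object type; the matching rule follows the earlier classification sketch, and ask_user stands in for whatever user-interface prompt the system provides.

```python
# Sketch: collect all matching types and let the user pick when there is more than one.
def candidate_types(features: dict, characteristics: dict) -> list[str]:
    candidates = []
    for obj_type, ranges in characteristics.items():
        if all(name in features and lo <= features[name] <= hi
               for name, (lo, hi) in ranges.items()):
            candidates.append(obj_type)
    return candidates

def resolve_type(features: dict, characteristics: dict, ask_user) -> str | None:
    candidates = candidate_types(features, characteristics)
    if len(candidates) > 1:
        return ask_user(candidates)          # user picks one of the displayed candidates
    return candidates[0] if candidates else None
```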

When target 106 is determined to be the general object type, the system determines (1660) a spatial relationship between target 106 and movable object 102 using the one or more extracted features. In some embodiments, the spatial relationship between target 106 and movable object 102 comprises (1662) a horizontal distance between target 106 and movable object 102. The system determines (1662) whether the horizontal distance is within a predetermined distance range.

In accordance with a determination that the horizontal distance is within the predetermined distance range, the system provides (1664) an option to the user to confirm to initiate the automatic tracking mode.

In accordance with a determination that the user does not confirm to initiate the automatic tracking mode, the system obtains (1666) a second image frame including target 106 subsequent to the first image frame. The system determines (1666) whether target 106 is a predetermined recognizable object type based on a comparison of one or more features extracted from the second image frame with one or more characteristics associated with the predetermined recognizable object type. In accordance with a determination that target 106 is a predetermined recognizable object type, the system initiates (1666) the tracking functions associated with the predetermined recognizable object type. In some embodiments, before receiving user confirmation on the first image frame, the system switches from GT mode to PRT mode when the target on the next frame is identified to be a predetermined recognizable target type.

In some embodiments, the horizontal distance range is determined (1670) in accordance with a height of movable object 102 relative to the ground. For instance, a safe distance range increases as the height of movable object 102 increases. In accordance with a determination that the horizontal distance is not within the predetermined distance range, the system sends (1672) a warning indication to the user. In some embodiments, the system adjusts (1674), or allows (1674) the user to adjust, one or more control parameters of movable object 102 such that an updated spatial relationship between movable object 102 and target 106 becomes suitable for initiating the automatic tracking mode. The system obtains (1674) one or more image frames subsequent to the first image frame for determining the updated spatial relationship between movable object 102 and target 106.
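A minimal sketch of a height-dependent horizontal-distance gate for the general object type; the linear relation between height and the allowable range is an illustrative assumption.

```python
# Sketch: allowable horizontal distance widens as the vehicle height increases.
def distance_range_for_height(height_m: float) -> tuple[float, float]:
    return (2.0, 10.0 + 2.0 * height_m)       # assumed linear relation

def distance_check(horizontal_distance_m: float, height_m: float) -> bool:
    """True when the distance is suitable for offering the automatic tracking mode."""
    lo, hi = distance_range_for_height(height_m)
    return lo <= horizontal_distance_m <= hi
```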

In some embodiments, the tracking functions associated with the predetermined recognizable object type include (1676) enabling one or more functional modules that allow movable object 102 to be controlled by a user input. In some embodiments, the system receives a user input indicating a user interest in target 106 on the first image frame. In some embodiments, the user input is received from a device (e.g., control device 104 and/or computer device 126) when the user views the first image frame on the device. In some embodiments, the user input indicates a boundary surrounding at least a part of target 106 displayed on the device. In some embodiments, the user input indicates a location of target 106 displayed on the device. In some embodiments, the user input is a user gesture captured by imaging device 214 borne by movable object 102.

In some embodiments, the system obtains (1678) a second image frame captured by imaging device 214. The system determines whether target 106 in the first image frame is still contained in the second image frame. In accordance with a determination that the target object is not included in the second image frame, the system identifies (1678) one or more candidate target objects from the second image frame. The candidate target objects may be identified as belonging to the same predetermined recognizable object type. The one or more candidate target objects are identified based on a comparison between one or more features extracted from the one or more candidate target objects respectively and one or more characteristics associated with the predetermined recognizable object type. The system determines (1678) whether the one or more extracted features of the respective one or more candidate target objects fit a target object model generated based on the one or more extracted features of the target object in the first image frame. In accordance with a determination that the one or more extracted features of a candidate target object fit the target object model, the system initiates (1678) the tracking operations associated with the target object. The target object model may be generated using one or more characteristics associated with the predetermined recognizable target object type. In some embodiments, the system generates one or more control parameters of movable object 102 to ensure target 106 is located at a center of one or more image frames subsequent to the first image frame captured by imaging device 214.
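A minimal sketch of re-acquiring a lost target by scoring candidates of the same type against a simple feature-vector model of the original target; the feature representation and the error threshold are assumptions made for the sketch.

```python
# Sketch: score candidate targets against a feature model of the lost target.
def fits_model(candidate: dict, model: dict, threshold: float = 0.2) -> bool:
    """True when the candidate's features are close enough to the target model."""
    keys = model.keys() & candidate.keys()
    if not keys:
        return False
    # normalized mean absolute difference across the shared features
    error = sum(abs(candidate[k] - model[k]) / (abs(model[k]) + 1e-6)
                for k in keys) / len(keys)
    return error < threshold

def reacquire(candidates: list[dict], model: dict) -> dict | None:
    """Return the first candidate that fits the model, or None if the target stays lost."""
    matches = [c for c in candidates if fits_model(c, model)]
    return matches[0] if matches else None
```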

Many features of the present disclosure can be performed in, using, or with the assistance of hardware, software, firmware, or combinations thereof. Consequently, features of the present disclosure may be implemented using a processing system. Exemplary processing systems (e.g., processor(s) 116, controller 210, controller 218, processor(s) 502 and/or processor(s) 602) include, without limitation, one or more general purpose microprocessors (for example, single or multi-core processors), application-specific integrated circuits, application-specific instruction-set processors, field-programmable gate arrays, graphics processing units, physics processing units, digital signal processing units, coprocessors, network processing units, audio processing units, encryption processing units, and the like.

Features of the present disclosure can be implemented in, using, or with the assistance of a computer program product, such as a storage medium (media) or computer readable medium (media) having instructions stored thereon/in which can be used to program a processing system to perform any of the features presented herein. The storage medium (e.g., memory 118, 504, 604) can include, but is not limited to, any type of disk including floppy disks, optical discs, DVDs, CD-ROMs, microdrives, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, DRAMs, VRAMs, DDR RAMs, flash memory devices, magnetic or optical cards, nanosystems (including molecular memory ICs), or any type of media or device suitable for storing instructions and/or data.

Stored on any one of the machine readable media, features of the present disclosure can be incorporated in software and/or firmware for controlling the hardware of a processing system, and for enabling a processing system to interact with other mechanisms utilizing the results of the present disclosure. Such software or firmware may include, but is not limited to, application code, device drivers, operating systems, and execution environments/containers.

Communication systems as referred to herein (e.g., communication systems 120, 510, 610) optionally communicate via wired and/or wireless communication connections. For example, communication systems optionally receive and send RF signals, also called electromagnetic signals. RF circuitry of the communication systems converts electrical signals to/from electromagnetic signals and communicates with communications networks and other communications devices via the electromagnetic signals. RF circuitry optionally includes well-known circuitry for performing these functions, including but not limited to an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a CODEC chipset, a subscriber identity module (SIM) card, memory, and so forth. Communication systems optionally communicate with networks, such as the Internet, also referred to as the World Wide Web (WWW), an intranet and/or a wireless network, such as a cellular telephone network, a wireless local area network (LAN) and/or a metropolitan area network (MAN), and other devices by wireless communication. Wireless communication connections optionally use any of a plurality of communications standards, protocols, and technologies, including but not limited to Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), high-speed downlink packet access (HSDPA), high-speed uplink packet access (HSUPA), Evolution-Data Only (EV-DO), HSPA, HSPA+, Dual-Cell HSPA (DC-HSPA), long term evolution (LTE), near field communication (NFC), wideband code division multiple access (W-CDMA), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Wireless Fidelity (Wi-Fi) (e.g., IEEE 802.11a, IEEE 802.11ac, IEEE 802.11ax, IEEE 802.11b, IEEE 802.11g and/or IEEE 802.11n), voice over Internet Protocol (VoIP), Wi-MAX, a protocol for e-mail (e.g., Internet message access protocol (IMAP) and/or post office protocol (POP)), instant messaging (e.g., extensible messaging and presence protocol (XMPP), Session Initiation Protocol for Instant Messaging and Presence Leveraging Extensions (SIMPLE), Instant Messaging and Presence Service (IMPS)), and/or Short Message Service (SMS), or any other suitable communication protocol, including communication protocols not yet developed as of the filing date of this document.

While various embodiments of the present disclosure have been described above, it should be understood that they have been presented by way of example, and not limitation. It will be apparent to persons skilled in the relevant art that various changes in form and detail can be made therein without departing from the spirit and scope of the disclosure.

The present disclosure has been described above with the aid of functional building blocks illustrating the performance of specified functions and relationships thereof. The boundaries of these functional building blocks have often been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Any such alternate boundaries are thus within the scope and spirit of the disclosure.

The terminology used in the description of the various described embodiments herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used in the description of the various described embodiments and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “includes,” “including,” “comprises,” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

As used herein, the term “if” may be construed to mean “when” or “upon” or “in response to determining” or “in accordance with a determination” or “in response to detecting” that a stated condition precedent is true, depending on the context. Similarly, the phrase “if it is determined [that a stated condition precedent is true]” or “if [a stated condition precedent is true]” or “when [a stated condition precedent is true]” may be construed to mean “upon determining” or “in response to determining” or “in accordance with a determination” or “upon detecting” or “in response to detecting” that the stated condition precedent is true, depending on the context.

The foregoing description of the present disclosure has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. The breadth and scope of the present disclosure should not be limited by any of the above-described exemplary embodiments. Many modifications and variations will be apparent to the practitioner skilled in the art. The modifications and variations include any relevant combination of the disclosed features. The embodiments were chosen and described in order to best explain the principles of the disclosure and its practical application, thereby enabling others skilled in the art to understand the disclosure for various embodiments and with various modifications that are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the following claims and their equivalents.

What is claimed is:
 1. A method for tracking a target object, comprising: at a computing system having one or more processors and memory storing programs executed by the one or more processors: obtaining an image frame captured by an imaging device carried by an unmanned vehicle, the image frame containing the target object; extracting one or more features of the target object from the image frame, the target object being within a region selected by a user on the image frame; determining whether the target object is a predetermined recognizable object type based on a comparison of the one or more features with one or more characteristics associated with the predetermined recognizable object type; in accordance with a determination that the target object is the predetermined recognizable object type, initiating tracking functions provided in the computing system and associated with the predetermined recognizable object type; and in accordance with a determination that the target object does not belong to the predetermined recognizable object type, initiating tracking functions provided in the computing system and associated with a general object type, wherein the tracking functions associated with the general object type include: determining a spatial relationship between the target object and the unmanned vehicle using the one or more features, the spatial relationship including a horizontal distance between the target object and the unmanned vehicle; determining whether the horizontal distance is within a predetermined distance range; and in accordance with a determination that the horizontal distance is within the predetermined distance range, providing an option to the user to confirm to initiate an automatic tracking mode.
 2. The method of claim 1, wherein the tracking functions associated with the predetermined recognizable object type include adjusting one or more control parameters that control at least one of the spatial relationship between the unmanned vehicle and the target object, a movement of the unmanned vehicle, or a movement of a gimbal carried by the unmanned vehicle.
 3. The method of claim 2, wherein the one or more control parameters are generated in accordance with the one or more characteristics associated with the predetermined recognizable object type.
 4. The method of claim 2, wherein the one or more control parameters include at least one of a yaw angle movement of the unmanned vehicle, a translational movement of the unmanned vehicle, a velocity of the unmanned vehicle, or an acceleration of the unmanned vehicle.
 5. The method of claim 2, wherein the imaging device is coupled to the gimbal, and the one or more control parameters include at least one of a yaw angle movement of the gimbal or a pitch angle movement of the gimbal.
 6. The method of claim 2, wherein the tracking functions associated with the predetermined recognizable object type further include: determining the spatial relationship between the unmanned vehicle and the target object using at least the one or more characteristics associated with the predetermined recognizable object type.
 7. The method of claim 2, wherein the tracking functions associated with the predetermined recognizable object type further include: determining whether a dimension of the target object displayed in the image frame is above a predetermined threshold value; in accordance with a determination that the dimension of the target object is above or equal to the predetermined threshold value, determining the spatial relationship between the unmanned vehicle and the target object; and in accordance with a determination that the dimension of the target object is below the predetermined threshold value, initiating the tracking functions associated with the general object type.
 8. The method of claim 2, wherein: the spatial relationship includes a pitch angle between the unmanned vehicle and the target object; and the tracking functions associated with the predetermined recognizable object type further include determining whether the pitch angle is lower than a predetermined value.
 9. The method of claim 8, wherein the tracking functions associated with the predetermined recognizable object type further include: in accordance with a determination that the pitch angle is lower than the predetermined value: sending a warning indication to the user; adjusting, or allowing the user to adjust, the one or more control parameters of the unmanned vehicle such that an updated pitch angle is equal to or above the predetermined value; and obtaining one or more subsequent image frames for determining the updated pitch angle, wherein the pitch angle is determined using a pitch angle of the gimbal carried by the unmanned vehicle and a target pitch angle of the target object displayed on the image frame.
 10. The method of claim 8, wherein the tracking functions associated with the predetermined recognizable object type further include: in accordance with a determination that the pitch angle is higher than or equal to the predetermined value, sending a request to the user to confirm to initiate the automatic tracking mode.
 11. The method of claim 2, wherein the tracking functions associated with the predetermined recognizable object type further include: in response to identifying the target object to be a human, recognizing one or more body gestures of the target object; and adjusting the one or more control parameters of the unmanned vehicle in accordance with the one or more body gestures of the target object.
 12. The method of claim 2, wherein the tracking functions associated with the predetermined recognizable object type further include: in response to identifying the target object to be a human, performing machine learning to obtain one or more personal features of the target object, the one or more personal features being used for automatically tracking the target object by the unmanned vehicle.
 13. The method of claim 1, wherein determining whether the target object is the predetermined recognizable object type includes: providing one or more candidate recognizable object types for user selection, the one or more candidate recognizable object types being identified based on the comparison between the one or more features and one or more characteristics associated with the one or more candidate recognizable object types; and receiving a user input indicating the target object is the predetermined recognizable object type selected from the one or more candidate recognizable object types.
 14. The method of claim 1, wherein the tracking functions associated with the general object type further include: in accordance with a determination that the horizontal distance is not within the predetermined distance range: sending a warning indication to the user; adjusting, or allowing the user to adjust, one or more control parameters of the unmanned vehicle such that an updated spatial relationship between the unmanned vehicle and the target object becomes suitable for initiating an automatic tracking mode; and obtaining one or more subsequent image frames for determining the updated spatial relationship between the unmanned vehicle and the target object.
 15. The method of claim 1, wherein the image frame is a first image frame; the method further comprising: obtaining a second image frame captured by the imaging device; and in accordance with a determination that the target object is not included in the second image frame: identifying one or more candidate target objects from the second image frame that belong to the predetermined recognizable object type, the one or more candidate target objects being identified based on comparison between one or more features extracted from the one or more candidate target objects respectively and the one or more characteristics associated with the predetermined recognizable object type; determining whether the one or more features of the one or more candidate target objects fit a target object model generated based on the one or more features of the target object; and in accordance with a determination that one or more features of one of the one or more candidate target objects fit the target object model, initiating tracking operations associated with the target object.
 16. The method of claim 1, wherein the predetermined distance range is determined in accordance with a height of the unmanned vehicle relative to the ground.
 17. The method of claim 1, wherein the one or more characteristics associated with the predetermined recognizable object type include at least one of a category, a shape, a pattern, a length range, a width range, a height range, a speed range, or an acceleration range of one or more objects of the predetermined recognizable object type.
 18. The method of claim 1, wherein the one or more features of the target object include one or more dimensional features displayed on the image frame, the one or more dimensional features including a shape, a pattern, a length, a width, or a height of the target object.
 19. The method of claim 18, further comprising: generating a bounding box based on the one or more dimensional features of the target object for defining the target object on the image frame.
 20. The method of claim 19, wherein the image frame is a first image frame; the method further comprising: obtaining a second image frame including the target object captured by the imaging device, wherein the one or more features of the target object further include a speed and an acceleration of the target object calculated based on the one or more dimensional features of the target object.